[Bug]: limit=10,but only 3 record is returned #38307
Comments
The title and description of this issue contain Chinese. Please use English to describe your issue.
Please download the documents (data) from this link: https://pan.quark.cn/s/391003bb76af
Hi @Royhuiy, you set limit = 10 but only get 3 results. The most likely reason is that there are too few entities in each bucket on average. I see your total entity count is 1234, and you search with 'nlist = 1024, nprobe = 10'.
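A rough back-of-the-envelope check of the reasoning above. This is only a sketch: it assumes entities are spread evenly across the IVF buckets, which real clustering does not guarantee, and the helper name `expected_candidates` is made up for illustration.

```python
# Rough estimate of how many entities an IVF search scans.
# Assumption: entities are spread evenly over the nlist buckets.

def expected_candidates(total_entities: int, nlist: int, nprobe: int) -> float:
    """Average number of entities scanned when probing nprobe of nlist buckets."""
    probed = min(nprobe, nlist)  # cannot probe more buckets than exist
    return total_entities / nlist * probed


# Numbers from this issue: 1234 entities, nlist = 1024, nprobe = 10.
print(expected_candidates(1234, 1024, 10))
```

With roughly one entity per bucket on average, many buckets are empty or hold a single entity, so a query that probes only 10 of them may see fewer than `limit` candidates.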
/assign
I set nlist = 32, nprobe = 4, limit = 10, but only 2 records are returned.
Hi @Royhuiy, 'nlist' is a parameter used to build the IVF_FLAT index, so if you change nlist to 32 you need to rebuild your index. Also, I don't understand what you mean by "could you show me the code how to split and embedding".
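A minimal sketch of rebuilding the index with a new nlist, assuming `collection` is a pymilvus `Collection` like the one in the issue description. The helper name `rebuild_ivf_index` is hypothetical; the four methods it calls (`release`, `drop_index`, `create_index`, `load`) are the ones the pymilvus ORM API provides.

```python
def rebuild_ivf_index(collection, field_name="embedding", nlist=32, metric_type="COSINE"):
    """Drop and recreate an IVF_FLAT index so that a new nlist takes effect."""
    index_params = {
        "metric_type": metric_type,
        "index_type": "IVF_FLAT",
        "params": {"nlist": nlist},
    }
    collection.release()     # a loaded collection must be released before its index is dropped
    collection.drop_index()  # remove the index built with the old nlist
    collection.create_index(field_name, index_params)
    collection.load()        # load again so the collection can be searched
    return index_params
```

Changing only the dictionary passed at search time has no effect on nlist; the value is fixed when the index is built.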
1. If I set nlist = 32, which index type should be set?
Hi @Royhuiy, I see "nlist = 32" now. If you search with "nprobe = 8/16/32", do you get different result counts?
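One way to run the check suggested above, sketched under the assumption that `collection` is a loaded pymilvus `Collection` and `query_vector` is a 768-dimensional embedding of the query text. The helper name `count_hits_per_nprobe` is hypothetical.

```python
def count_hits_per_nprobe(collection, query_vector, nprobes=(8, 16, 32), limit=10):
    """Search once per nprobe value and record how many hits come back."""
    counts = {}
    for nprobe in nprobes:
        results = collection.search(
            data=[query_vector],
            anns_field="embedding",
            param={"metric_type": "COSINE", "params": {"nprobe": nprobe}},
            limit=limit,
        )
        counts[nprobe] = len(results[0])  # one query vector, so one list of hits
    return counts
```

If the count stays the same for every nprobe value, sparse buckets are probably not the cause and something else is limiting the results.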
When I set query = "猫", 3 results are always returned for nprobe = 8/16/32.
Hi @Royhuiy, are you using "document path" as the primary key? You can use "auto_id" as described in this doc: https://milvus.io/docs/primary-field.md
You are right. The primary key is duplicated. Thanks for your help~
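A small sketch of how the duplication could be spotted before inserting, assuming the rows are plain dictionaries keyed by the field names from the schema in this issue. The helper name `find_duplicate_keys` and the sample paths are made up for illustration.

```python
from collections import Counter


def find_duplicate_keys(rows, key_field="document_path"):
    """Return each primary-key value that appears more than once, with its count."""
    counts = Counter(row[key_field] for row in rows)
    return {key: n for key, n in counts.items() if n > 1}


# Hypothetical rows: every paragraph of one document shares the same path.
rows = [
    {"document_path": "docs/a.txt", "paragraph": 0},
    {"document_path": "docs/a.txt", "paragraph": 1},
    {"document_path": "docs/b.pdf", "paragraph": 0},
]
print(find_duplicate_keys(rows))
```

Because each document is split into several paragraphs that share one path, `document_path` cannot serve as a unique key here; an auto-generated INT64 primary key, as the linked doc describes, avoids the collision.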
Welcome to use Milvus more and raise more issues :P
/close
@cydrain: Closing this issue. In response to this:
Is there an existing issue for this?
Environment
Current Behavior
I used bge-base-zh-v1.5 to embed txt/docx/pdf files and inserted the embeddings into Milvus. The collection is built as follows:
document_fields = [
    FieldSchema(name="paragraph", dtype=DataType.INT64),
    FieldSchema(name="embedding", dtype=DataType.FLOAT_VECTOR, dim=768),
    FieldSchema(name="text", dtype=DataType.VARCHAR, max_length=2096),
    FieldSchema(name="text_type", dtype=DataType.VARCHAR, max_length=2096),
    FieldSchema(name="document_path", dtype=DataType.VARCHAR, is_primary=True, max_length=2096),
]
schema = CollectionSchema(document_fields)
collection = Collection(collection_name, schema)
Create an index to speed up search:
index_params = {"metric_type": "COSINE", "index_type": "IVF_FLAT", "params": {"nlist": 1024}}
collection.create_index("embedding", index_params)
paragraph: paragraph number; embedding: vector; text: text content; text_type: file type (txt/docx/pdf); document_path: path of the document.
The search script:
def search_document_by_text(self, query_text: str = "", top_k: int = 10):
    try:
limit = 10, but only 3 records are returned.
Expected Behavior
No response
Steps To Reproduce
No response
Milvus Log
No response
Anything else?
No response