[Bug]: [benchmark][cluster] dql request timeout in concurrent dql & multi-partition scene #38275
Comments
Different case, same error. argo task: fouramf-concurrent-not-found server:
test steps:
same error test cases:
Related to milvus-io#38275. Make rootcoord describe collection execute without the scheduler lock, in order to remove the deadlock introduced when sync partition and lock segment describe collection. Signed-off-by: Congqi Xia <[email protected]>
This issue was caused by a logical deadlock between CreatePartition (SyncPartition to QueryCoord) and segment loading (describe collection). Trying to solve this problem by moving describe collection out of the lock.
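To make the lock cycle described above concrete, here is a minimal sketch in Python. It is a simplified model, not Milvus code: `ddl_lock` stands in for the RootCoord DDL scheduler lock, every function name is hypothetical, and a timeout replaces the real hang so the script terminates.

```python
import threading

# Assumption: one non-reentrant lock models the RootCoord DDL scheduler lock.
ddl_lock = threading.Lock()


def describe_collection(timeout):
    """Hypothetical describe collection: it must take the DDL lock first."""
    if not ddl_lock.acquire(timeout=timeout):
        return False  # could not get the lock in time
    ddl_lock.release()
    return True


def load_segment(timeout):
    """Hypothetical QueryCoord step that depends on describe collection."""
    return describe_collection(timeout)


def sync_partition_to_querycoord(timeout):
    """Hypothetical sync call that only returns once segment loading has run."""
    result = []
    worker = threading.Thread(target=lambda: result.append(load_segment(timeout)))
    worker.start()
    worker.join()
    return result[0]


def create_partition(timeout=0.2):
    """Holds the DDL lock while waiting on QueryCoord, which closes the cycle."""
    with ddl_lock:
        return sync_partition_to_querycoord(timeout)


blocked = not create_partition()
print("describe collection blocked behind create partition:", blocked)
```

Without the timeout, both sides would wait on each other indefinitely, which matches the request timeout seen from the client.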
The fix does not seem to be working very well. Right now, to keep the proxy cache consistent, all describe collection requests from the proxy need to wait for the DDL lock; you cannot skip it. Could you explain why the deadlock happened?
Related to milvus-io#38275. This PR moves the sync created partition step to the proxy to avoid a potential logical deadlock when create partition happens together with a target segment change. Signed-off-by: Congqi Xia <[email protected]>
Related to #38275. This PR moves the sync created partition step to the proxy to avoid a potential logical deadlock when create partition happens together with a target segment change. Signed-off-by: Congqi Xia <[email protected]>
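A rough sketch of why moving the sync step to the proxy avoids the cycle, again as a simplified Python model rather than Milvus code. `ddl_lock` stands in for the RootCoord DDL lock and all function names are hypothetical. The point is only that the sync call now runs after the lock is released, so a describe collection issued during the sync can still acquire it.

```python
import threading

# Assumption: one non-reentrant lock models the RootCoord DDL lock.
ddl_lock = threading.Lock()


def describe_collection(timeout=0.2):
    """Hypothetical describe collection that still waits for the DDL lock."""
    got = ddl_lock.acquire(timeout=timeout)
    if got:
        ddl_lock.release()
    return got


def rootcoord_create_partition(partitions, name):
    """Hypothetical RootCoord step: only the metadata change runs under the lock."""
    with ddl_lock:
        partitions.append(name)


def querycoord_sync_partition(timeout=0.2):
    """Hypothetical QueryCoord sync that needs describe collection to succeed."""
    result = []
    worker = threading.Thread(
        target=lambda: result.append(describe_collection(timeout))
    )
    worker.start()
    worker.join()
    return result[0]


def proxy_create_partition(partitions, name):
    """The proxy drives both steps; the DDL lock is free before the sync starts."""
    rootcoord_create_partition(partitions, name)
    return querycoord_sync_partition()


partitions = []
synced = proxy_create_partition(partitions, "p1")
print("partition synced without blocking:", synced)
```

Describe collection still takes the DDL lock here, which keeps the proxy cache consistency requirement raised in the review comment.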
Verification passed. argo task: fouramf-bitmap-scenes-tw86w; argo task: fouramf-concurrent-xcq
Is there an existing issue for this?
Environment
Current Behavior
argo task: fouramf-concurrent-not-found
test case name: test_hybrid_search_locust_dql_dml_partition_hybrid_search_cluster
server:
{pod=~"fouramf-concurrent-not-found-1-milvus-proxy-7f579d96b6-zt9l8"} |~ "5d26cb436d1ae7dd6dc18cab24c1b7cc"
The requery took a long time and caused a timeout.
client log:
Expected Behavior
No response
Steps To Reproduce
Milvus Log
No response
Anything else?
test result: