
[Bug]: io.milvus.v2.exception.MilvusClientException: fail to Query on QueryNode 6: worker(6) query failed: getrandom #38265

Closed
walker1024 opened this issue Dec 6, 2024 · 7 comments

Comments

@walker1024

Is there an existing issue for this?

  • I have searched the existing issues

Environment

- Milvus version:v2.4.11
- Deployment mode(standalone or cluster):standalone
- MQ type(rocksmq, pulsar or kafka):    kafka
- SDK version(e.g. pymilvus v2.0.0rc2):jdk8
- OS(Ubuntu or CentOS): CentOS7
- CPU/Memory: 2Core/4G
- GPU: 
- Others:

Current Behavior

When I perform a vector ANN search, the service reports an error if the request includes the outputFields parameter.

If I send the same request without the outputFields parameter, it returns normally.

Expected Behavior

The search with outputFields should return normally.

Steps To Reproduce

1. Create the collection.
2. Insert data.
3. Run an ANN search with the outputFields parameter: the error above is returned (see the sketch below).
4. Run the same ANN search without the outputFields parameter: it returns normally.
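For illustration, a minimal sketch of steps 3 and 4 with the Java SDK v2. This is not the reporter's exact code: the query values are placeholders, milvusClient is assumed to be an already-connected io.milvus.v2.client.MilvusClientV2, and package paths follow milvus-sdk-java 2.4.x.

    import java.util.Arrays;
    import java.util.Collections;

    import io.milvus.v2.service.vector.request.SearchReq;
    import io.milvus.v2.service.vector.request.data.FloatVec;
    import io.milvus.v2.service.vector.response.SearchResp;

    // Placeholder query vector; its length must match the vector field's dimension.
    FloatVec queryVector = new FloatVec(Arrays.asList(0.1f, 0.2f, 0.3f, 0.4f));

    // Step 3: search WITH outputFields -- this request is reported to fail with
    // "fail to Query on QueryNode 6: worker(6) query failed: getrandom".
    SearchReq withOutputFields = SearchReq.builder()
            .collectionName("img_coll_12_02")
            .data(Collections.singletonList(queryVector))
            .topK(5)
            .outputFields(Arrays.asList("id", "vector"))
            .build();
    SearchResp failing = milvusClient.search(withOutputFields);

    // Step 4: the same search WITHOUT outputFields -- reported to return normally.
    SearchReq withoutOutputFields = SearchReq.builder()
            .collectionName("img_coll_12_02")
            .data(Collections.singletonList(queryVector))
            .topK(5)
            .build();
    SearchResp passing = milvusClient.search(withoutOutputFields);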

Milvus Log

{"log":"[2024/12/06 02:53:48.479 +00:00] [WARN] [proxy/impl.go:3478] ["Query failed to WaitToFinish"] [traceID=128f9f8ab5179b4d8f2ba0fb0534eb5a] [role=proxy] [db=default] [collection=img_coll_12_02] [partitions="[]"] [ConsistencyLevel=Strong] [useDefaultConsistency=false] [error="failed to query: failed to search/query delegator 6 for channel by-dev-rootcoord-dml_5_453943773523046964v0: fail to Query on QueryNode 6: worker(6) query failed: getrandom"] [errorVerbose="failed to query: failed to search/query delegator 6 for channel by-dev-rootcoord-dml_5_453943773523046964v0: fail to Query on QueryNode 6: worker(6) query failed: getrandom\n(1) attached stack trace\n -- stack trace:\n | github.com/milvus-io/milvus/internal/proxy.(*queryTask).Execute\n | \t/workspace/source/internal/proxy/task_query.go:471\n | github.com/milvus-io/milvus/internal/proxy.(*taskScheduler).processTask\n | \t/workspace/source/internal/proxy/task_scheduler.go:474\n | github.com/milvus-io/milvus/internal/proxy.(*taskScheduler).queryLoop.func1\n | \t/workspace/source/internal/proxy/task_scheduler.go:553\n | github.com/milvus-io/milvus/pkg/util/conc.(*Pool[...]).Submit.func1\n | \t/workspace/source/pkg/util/conc/pool.go:81\n | github.com/panjf2000/ants/v2.(*goWorker).run.func1\n | \t/go/pkg/mod/github.com/panjf2000/ants/[email protected]/worker.go:67\nWraps: (2) failed to query\nWraps: (3) attached stack trace\n -- stack trace:\n | github.com/milvus-io/milvus/internal/proxy.(*LBPolicyImpl).ExecuteWithRetry.func1\n | \t/workspace/source/internal/proxy/lb_policy.go:188\n | [...repeated from below...]\nWraps: (4) failed to search/query delegator 6 for channel by-dev-rootcoord-dml_5_453943773523046964v0\nWraps: (5) attached stack trace\n -- stack trace:\n | github.com/milvus-io/milvus/internal/proxy.(*queryTask).queryShard\n | \t/workspace/source/internal/proxy/task_query.go:566\n | github.com/milvus-io/milvus/internal/proxy.(*LBPolicyImpl).ExecuteWithRetry.func1\n | \t/workspace/source/internal/proxy/lb_policy.go:180\n | github.com/milvus-io/milvus/pkg/util/retry.Do\n | \t/workspace/source/pkg/util/retry/retry.go:44\n | github.com/milvus-io/milvus/internal/proxy.(*LBPolicyImpl).ExecuteWithRetry\n | \t/workspace/source/internal/proxy/lb_policy.go:154\n | github.com/milvus-io/milvus/internal/proxy.(*LBPolicyImpl).Execute.func2\n | \t/workspace/source/internal/proxy/lb_policy.go:218\n | golang.org/x/sync/errgroup.(*Group).Go.func1\n | \t/go/pkg/mod/golang.org/x/[email protected]/errgroup/errgroup.go:75\n | runtime.goexit\n | \t/usr/local/go/src/runtime/asm_amd64.s:1650\nWraps: (6) fail to Query on QueryNode 6\nWraps: (7) worker(6) query failed: getrandom\nError types: (1) *withstack.withStack (2) *errutil.withPrefix (3) *withstack.withStack (4) *errutil.withPrefix (5) *withstack.withStack (6) *errutil.withPrefix (7) merr.milvusError"]\n","stream":"stdout","time":"2024-12-06T02:53:48.480521644Z"}
{"log":"[2024/12/06 02:53:48.480 +00:00] [WARN] [proxy/task_search.go:688] ["failed to requery"] [traceID=128f9f8ab5179b4d8f2ba0fb0534eb5a] [nq=1] [error="fail to Query on QueryNode 6: worker(6) query failed: getrandom"]\n","stream":"stdout","time":"2024-12-06T02:53:48.481061288Z"}
{"log":"[2024/12/06 02:53:48.481 +00:00] [WARN] [proxy/task_scheduler.go:485] ["Failed to post-execute task: "] [traceID=128f9f8ab5179b4d8f2ba0fb0534eb5a] [error="fail to Query on QueryNode 6: worker(6) query failed: getrandom"]\n","stream":"stdout","time":"2024-12-06T02:53:48.481649548Z"}

Anything else?

Looking at the service log, there is another ERROR entry; I am not sure whether it is related to this exception.

{"log":"[2024/12/06 02:37:21.795 +00:00] [WARN] [segments/segment_loader.go:702] ["load segment failed when load data into memory"] [traceID=13ff16ff055dabbe194120096cd95970] [collectionID=453943773523046964] [segmentType=Sealed] [requestSegments="[453943773523995151]"] [preparedSegments="[453943773523995151]"] [partitionID=453943773523046965] [segmentID=453943773523995151] [segmentType=L0] [error="At LoadDeltaLogs: parse magic number failed, expected: 16775868, actual: 993080882"] [errorVerbose="At LoadDeltaLogs: parse magic number failed, expected: 16775868, actual: 993080882\n(1) attached stack trace\n -- stack trace:\n | github.com/milvus-io/milvus/internal/querynodev2/segments.(*segmentLoader).Load.func5\n | \t/workspace/source/internal/querynodev2/segments/segment_loader.go:720\n | github.com/milvus-io/milvus/pkg/util/funcutil.ProcessFuncParallel.func3\n | \t/workspace/source/pkg/util/funcutil/parallel.go:86\n | runtime.goexit\n | \t/usr/local/go/src/runtime/asm_amd64.s:1650\nWraps: (2) At LoadDeltaLogs\nWraps: (3) parse magic number failed, expected: 16775868, actual: 993080882\nError types: (1) *withstack.withStack (2) *errutil.withPrefix (3) *errors.errorString"]\n","stream":"stdout","time":"2024-12-06T02:37:22.291415582Z"}
{"log":"[2024/12/06 02:37:21.795 +00:00] [INFO] [segments/segment_loader.go:704] ["load segment done"] [traceID=13ff16ff055dabbe194120096cd95970] [collectionID=453943773523046964] [segmentType=Sealed] [requestSegments="[453943773523995151]"] [preparedSegments="[453943773523995151]"] [partitionID=453943773523046965] [segmentID=453943773523995151] [segmentType=L0]\n","stream":"stdout","time":"2024-12-06T02:37:22.291424364Z"}
{"log":"[2024/12/06 02:37:21.795 +00:00] [ERROR] [funcutil/parallel.go:88] [loadSegmentFunc] [error="At LoadDeltaLogs: parse magic number failed, expected: 16775868, actual: 993080882"] [errorVerbose="At LoadDeltaLogs: parse magic number failed, expected: 16775868, actual: 993080882\n(1) attached stack trace\n -- stack trace:\n | github.com/milvus-io/milvus/internal/querynodev2/segments.(*segmentLoader).Load.func5\n | \t/workspace/source/internal/querynodev2/segments/segment_loader.go:720\n | github.com/milvus-io/milvus/pkg/util/funcutil.ProcessFuncParallel.func3\n | \t/workspace/source/pkg/util/funcutil/parallel.go:86\n | runtime.goexit\n | \t/usr/local/go/src/runtime/asm_amd64.s:1650\nWraps: (2) At LoadDeltaLogs\nWraps: (3) parse magic number failed, expected: 16775868, actual: 993080882\nError types: (1) *withstack.withStack (2) *errutil.withPrefix (3) *errors.errorString"] [idx=0] [stack="github.com/milvus-io/milvus/pkg/util/funcutil.ProcessFuncParallel.func3\n\t/workspace/source/pkg/util/funcutil/parallel.go:88"]\n","stream":"stdout","time":"2024-12-06T02:37:22.291433422Z"}

@walker1024 walker1024 added kind/bug Issues or changes related a bug needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Dec 6, 2024
@binbinlv
Contributor

binbinlv commented Dec 6, 2024

/assign @aoiasd

Could you help take a look? Thanks.

@binbinlv
Contributor

binbinlv commented Dec 6, 2024

This may be similar to issue #36271.

@binbinlv binbinlv added triage/accepted Indicates an issue or PR is ready to be actively worked on. and removed needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Dec 6, 2024
@binbinlv
Contributor

binbinlv commented Dec 6, 2024

@walker1024
could you please upload the complete logs and the complete script you used, if convenient?

Thanks.

@yanliang567
Contributor

@walker1024 what types of fields did you set in the output_fields?

@yanliang567 yanliang567 added triage/needs-information Indicates an issue needs more information in order to work on it. and removed triage/accepted Indicates an issue or PR is ready to be actively worked on. labels Dec 7, 2024
@walker1024
Author

@walker1024 could you please upload the complete logs and the complete script you used, if convenient?

Thanks.

@walker1024 what types of fields did you set in the output_fields?

My collection has two fields, id and vector, so I set the output fields to id and vector (see the sketch below).
collection fields (screenshot)
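For reference, a minimal sketch of such a two-field collection with the Java SDK v2. The dimension and the primary-key settings below are assumptions made for illustration, not values taken from the screenshot.

    import io.milvus.v2.common.DataType;
    import io.milvus.v2.service.collection.request.AddFieldReq;
    import io.milvus.v2.service.collection.request.CreateCollectionReq;

    // Schema with an Int64 primary key "id" and a FloatVector field "vector".
    CreateCollectionReq.CollectionSchema schema = milvusClient.createSchema();
    schema.addField(AddFieldReq.builder()
            .fieldName("id")
            .dataType(DataType.Int64)
            .isPrimaryKey(true)
            .build());
    schema.addField(AddFieldReq.builder()
            .fieldName("vector")
            .dataType(DataType.FloatVector)
            .dimension(768) // assumed dimension, not stated in the issue
            .build());

    milvusClient.createCollection(CreateCollectionReq.builder()
            .collectionName("img_coll_12_02")
            .collectionSchema(schema)
            .build());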

@walker1024
Author

@walker1024 could you please upload the complete logs and the complete script you used, if convenient?

Thanks.

OK.
Complete script:

    // Build the ANN search request with outputFields ("id", "vector") -- the call that fails.
    FloatVec queryVector = new FloatVec(Lists.newArrayList(xxx));
    Integer topK = 5;
    SearchReq.SearchReqBuilder searchReq = SearchReq.builder()
            .collectionName("img_coll_12_02")
            .data(Collections.singletonList(queryVector))
            .topK(topK)
            .outputFields(Lists.newArrayList("id", "vector"));

    // Execute the search and log every hit.
    SearchResp searchResp = milvusClient.annSearch(searchReq.build());
    if (searchResp != null && CollectionUtils.isNotEmpty(searchResp.getSearchResults())) {
        List<List<SearchResp.SearchResult>> searchResults = searchResp.getSearchResults();
        for (List<SearchResp.SearchResult> results : searchResults) {
            for (SearchResp.SearchResult result : results) {
                log.info("result :{}", JSON.toJSONString(result));
            }
        }
    }

Complete logs:

query_error.txt

@yanliang567
Contributor

@walker1024 we need the Milvus logs from all the pods; could you please provide all of them? I tried to reproduce the issue in-house, but had no luck.
