DiscoveryEngineSearchTool auto-detect sends redundant CHUNKS searches under concurrent first use #6101

@glaziermag

Summary

DiscoveryEngineSearchTool(search_result_mode=None) appears to try CHUNKS first during auto-detect and to cache _search_result_mode only after the structured-datastore fallback. Under concurrent first use of a fresh shared tool instance, multiple callers can enter the auto-detect branch before the cache is set, and each one emits a redundant failing CHUNKS SearchService.Search call before its DOCUMENTS retry.

Environment

  • google-adk: 2.2.0
  • google-cloud-discoveryengine: 0.20.0
  • google-api-core: 2.31.0
  • google-auth: 2.53.0
  • grpcio: 1.81.1
  • Python: 3.14.4
  • OS: macOS arm64
  • Relevant source: src/google/adk/tools/discovery_engine_search_tool.py

Evidence

  • A fresh gh search of open and closed issues/PRs found no clear duplicate.
  • Current source still appears to try SearchResultMode.CHUNKS before SearchResultMode.DOCUMENTS when search_result_mode=None, then mutates self._search_result_mode to DOCUMENTS after the structured-datastore fallback. I did not find an obvious single-flight lock around that transition.
  • The evidence run used a tiny temporary structured JSON datastore. Project, account, datastore, and document identifiers are intentionally omitted here.
  • Total actual SearchService.Search calls in the evidence run: 4092 / cap 5000.
  • 429 count: 0.
  • Max rolling 60s Search starts: 120 / effective cap 120.
  • Direct explicit DOCUMENTS control passed.
  • Direct no-content-search-spec control passed.
  • ADK explicit DOCUMENTS control: 1000 wrapper calls, 1000 DOCUMENTS searches, 0 CHUNKS.
  • ADK auto concurrent cold start: 1000 wrapper calls, 2000 low-level searches.
  • ADK auto CHUNKS calls: 1000.
  • ADK auto DOCUMENTS calls: 1000.
  • Redundant CHUNKS total above a single-flight expectation: 950.
  • Auto calls per wrapper: 2.0x.
  • Explicit DOCUMENTS calls per wrapper: 1.0x.
  • Cleanup verified temporary datastore deletion by 404 NOT_FOUND.

User Impact

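This multiplies SearchService.Search calls during concurrent cold start (2.0x per wrapper call in the evidence run, versus 1.0x with explicit DOCUMENTS), which increases quota usage and latency and can increase billable request volume.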

Expected Behavior

For a structured datastore, concurrent first use of a fresh shared DiscoveryEngineSearchTool(search_result_mode=None) should single-flight auto-detection. At most one caller should need to probe CHUNKS and learn the DOCUMENTS fallback for that tool instance; concurrent callers should reuse the detected result mode.

Possible fixes:

  • Guard auto-detection with a per-instance single-flight lock.
  • Default to DOCUMENTS for structured datastores when that can be known without probing.
  • Document that structured datastore users should set search_result_mode=SearchResultMode.DOCUMENTS to avoid extra calls.
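
The first option could look roughly like the sketch below. It is a standalone illustration of double-checked locking around a cached mode, not a patch against the ADK source: SingleFlightSearcher, Mode, StructuredDatastoreError, and the search_fn callable are all hypothetical stand-ins, and it assumes synchronous, thread-based callers.

```python
import threading
from enum import Enum


class Mode(Enum):
    """Hypothetical stand-in for SearchResultMode."""
    CHUNKS = "CHUNKS"
    DOCUMENTS = "DOCUMENTS"


class StructuredDatastoreError(Exception):
    """Hypothetical stand-in for the structured-datastore InvalidArgument."""


class SingleFlightSearcher:
    """Hypothetical wrapper that probes CHUNKS at most once per instance."""

    def __init__(self, search_fn, mode=None):
        self._search_fn = search_fn        # callable(query, mode) -> result
        self._mode = mode                  # None means auto-detect
        self._detect_lock = threading.Lock()

    def search(self, query):
        mode = self._mode
        if mode is not None:
            # Fast path: mode already known, no lock needed.
            return self._search_fn(query, mode)
        with self._detect_lock:
            # Re-check under the lock: another caller may have finished probing.
            if self._mode is None:
                try:
                    result = self._search_fn(query, Mode.CHUNKS)
                    self._mode = Mode.CHUNKS
                    return result
                except StructuredDatastoreError:
                    self._mode = Mode.DOCUMENTS
            mode = self._mode
        # Run the real search outside the lock with the detected mode.
        return self._search_fn(query, mode)
```

In this sketch, callers that arrive during the probe wait on the lock instead of probing themselves, so cold-start latency is serialized behind one CHUNKS attempt. Any other exception from the probe propagates and leaves the mode undetected, so a later caller probes again.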

Minimal Repro Shape

This can be reproduced without live Discovery Engine calls:

  1. Instantiate one fresh shared DiscoveryEngineSearchTool(search_result_mode=None).

  2. Replace _discovery_engine_client with a fake client.

  3. Have the fake client sleep briefly on CHUNKS requests, then raise the same structured-datastore InvalidArgument message:

    content_search_spec.search_result_mode must be set to
    SearchRequest.ContentSearchSpec.SearchResultMode.DOCUMENTS
    when the engine contains structured data store.
    
  4. Have DOCUMENTS requests return one fake document result.

  5. Release N worker threads through a threading.Barrier.

Current behavior emits more than one CHUNKS probe for a single fresh shared tool instance. Fixed behavior should emit at most one.

Control: explicit SearchResultMode.DOCUMENTS emits exactly one DOCUMENTS call per wrapper call and zero CHUNKS calls.
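
A minimal, self-contained sketch of that repro shape follows. It does not import ADK: UnguardedAutoDetect is a hypothetical model of the behavior described in this report (CHUNKS first, cache written only after the fallback), FakeClient is a hypothetical fake, and ValueError stands in for InvalidArgument. To test the real tool, the fake would instead be assigned to _discovery_engine_client and shaped to match whatever the tool actually calls.

```python
import threading
import time

CHUNKS, DOCUMENTS = "CHUNKS", "DOCUMENTS"
STRUCTURED_MSG = (
    "content_search_spec.search_result_mode must be set to "
    "SearchRequest.ContentSearchSpec.SearchResultMode.DOCUMENTS "
    "when the engine contains structured data store."
)


class FakeClient:
    """Hypothetical fake: counts calls per mode; CHUNKS sleeps, then fails."""

    def __init__(self, chunks_delay=0.05):
        self.calls = {CHUNKS: 0, DOCUMENTS: 0}
        self._lock = threading.Lock()
        self._delay = chunks_delay

    def search(self, mode):
        with self._lock:
            self.calls[mode] += 1
        if mode == CHUNKS:
            time.sleep(self._delay)           # widen the race window
            raise ValueError(STRUCTURED_MSG)  # stand-in for InvalidArgument
        return ["fake-document"]


class UnguardedAutoDetect:
    """Model of the reported behavior; an assumption, not the ADK source."""

    def __init__(self, client, mode=None):
        self._client = client
        self._search_result_mode = mode

    def search(self):
        if self._search_result_mode is not None:
            return self._client.search(self._search_result_mode)
        try:
            return self._client.search(CHUNKS)
        except ValueError as exc:
            if "structured data store" not in str(exc):
                raise
            # Cache is written only here, after the failed probe.
            self._search_result_mode = DOCUMENTS
            return self._client.search(DOCUMENTS)


def run_cold_start(tool, n_workers):
    """Release n_workers threads at once against one shared tool instance."""
    barrier = threading.Barrier(n_workers)

    def worker():
        barrier.wait()
        tool.search()

    threads = [threading.Thread(target=worker) for _ in range(n_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()


auto_client = FakeClient()
run_cold_start(UnguardedAutoDetect(auto_client), n_workers=8)
print("auto:", auto_client.calls)

explicit_client = FakeClient()
run_cold_start(UnguardedAutoDetect(explicit_client, mode=DOCUMENTS), n_workers=8)
print("explicit:", explicit_client.calls)
```

The exact CHUNKS count in the auto case depends on thread scheduling, so an assertion for the unfixed behavior should check for more than one CHUNKS call rather than a specific number. A fixed implementation should satisfy a CHUNKS count of at most one.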

Scope of Run

The evidence run used only discoveryengine.googleapis.com SearchService.Search against a tiny temporary structured JSON datastore. It did not use Gemini, an ADK agent loop, VertexAiSearchTool model calls, process_llm_request, AnswerQuery, SummarySpec, Search Summary, Grounded Generation, Ranking API, Cloud Storage, BigQuery, Document AI, Compute, Cloud Run, GKE, custom Vertex endpoints, OCR, or layout parsing.

No project IDs, billing IDs, local filesystem paths, datastore names, document names, or user-specific identifiers are included in this report.

Metadata

Labels

tools: [Component] This issue is related to tools
