Improve --text search timeouts #71

jessopb · 2022-08-12T20:47:48Z

For example,
./lbrynet claim search --text="(\"silver\" + bitten)"
finally returns after the 5th try for me.


moodyjon · 2022-08-16T18:38:54Z

Modified scripts/test_claim_search.py and tested a few text queries against spvNN.lbry.com. The hubs that accept connections usually reply within the 10s timeout, but I did get one close call (9.9s) and one timeout.

(Omitting the non-responsive spv11,12,13,14,15)

(lbry-venv-3.9) swdev1@Jonathans-Mac-mini lbry-sdk % python3 ./scripts/test_claim_search.py
{
    "server": "('spv16.lbry.com', 50001)",
    "time": 0.7056792500000002,
    "error": null,
    "args": {
        "no_totals": false,
        "page_size": 100,
        "page": 1,
        "text": "(\"silver\" + bitten)"
    },
    "offset": 0,
    "total": 12,
    "blocked_total": 0
}
{
    "server": "('spv17.lbry.com', 50001)",
    "time": 0.20056029199999958,
    "error": null,
    "args": {
        "no_totals": false,
        "page_size": 100,
        "page": 1,
        "text": "(\"silver\" + bitten)"
    },
    "offset": 0,
    "total": 12,
    "blocked_total": 0
}
{
    "server": "('spv18.lbry.com', 50001)",
    "time": 0.817374,
    "error": null,
    "args": {
        "no_totals": false,
        "page_size": 100,
        "page": 1,
        "text": "(\"silver\" + bitten)"
    },
    "offset": 0,
    "total": 12,
    "blocked_total": 0
}
{
    "server": "('spv19.lbry.com', 50001)",
    "time": 0.3833993339999999,
    "error": null,
    "args": {
        "no_totals": false,
        "page_size": 100,
        "page": 1,
        "text": "(\"silver\" + bitten)"
    },
    "offset": 0,
    "total": 12,
    "blocked_total": 0
}

Second query:

(lbry-venv-3.9) swdev1@Jonathans-Mac-mini lbry-sdk % python3 ./scripts/test_claim_search.py
{
    "server": "('spv16.lbry.com', 50001)",
    "time": 1.4023516669999996,
    "error": null,
    "args": {
        "no_totals": false,
        "page_size": 100,
        "page": 1,
        "text": "(cord + extension)"
    },
    "offset": 0,
    "total": 399,
    "blocked_total": 0
}
{
    "server": "('spv18.lbry.com', 50001)",
    "time": 1.0191300420000005,
    "error": null,
    "args": {
        "no_totals": false,
        "page_size": 100,
        "page": 1,
        "text": "(cord + extension)"
    },
    "offset": 0,
    "total": 399,
    "blocked_total": 0
}
{
    "server": "('spv19.lbry.com', 50001)",
    "time": 0.3879024590000011,
    "error": null,
    "args": {
        "no_totals": false,
        "page_size": 100,
        "page": 1,
        "text": "(cord + extension)"
    },
    "offset": 0,
    "total": 399,
    "blocked_total": 0
}
{
    "server": "('spv17.lbry.com', 50001)",
    "time": 9.945044459000002,
    "error": null,
    "args": {
        "no_totals": false,
        "page_size": 100,
        "page": 1,
        "text": "(cord + extension)"
    },
    "offset": 0,
    "total": 399,
    "blocked_total": 0
}

Third query:

(lbry-venv-3.9) swdev1@Jonathans-Mac-mini lbry-sdk % python3 ./scripts/test_claim_search.py
{
    "server": "('spv16.lbry.com', 50001)",
    "time": 3.2578703750000004,
    "error": null,
    "args": {
        "no_totals": false,
        "page_size": 100,
        "page": 1,
        "text": "foo + bar | baz"
    },
    "offset": 0,
    "total": 1000,
    "blocked_total": 0
}
{
    "server": "('spv19.lbry.com', 50001)",
    "time": 0.6107135420000009,
    "error": null,
    "args": {
        "no_totals": false,
        "page_size": 100,
        "page": 1,
        "text": "foo + bar | baz"
    },
    "offset": 0,
    "total": 1000,
    "blocked_total": 0
}
{
    "server": "('spv18.lbry.com', 50001)",
    "time": 1.7782999579999998,
    "error": null,
    "args": {
        "no_totals": false,
        "page_size": 100,
        "page": 1,
        "text": "foo + bar | baz"
    },
    "offset": 0,
    "total": 1000,
    "blocked_total": 0
}
Wallet server (spv17.lbry.com:50001) returned an error. Code: -32000 Message: query timed out
{
    "server": "('spv17.lbry.com', 50001)",
    "time": 11.295952083,
    "error": "(-32000, 'query timed out')",
    "args": {
        "no_totals": false,
        "page_size": 100,
        "page": 1,
        "text": "foo + bar | baz"
    },
    "offset": null,
    "total": null,
    "blocked_total": null
}

moodyjon · 2022-08-16T19:14:59Z

Perhaps the thing to do here is spread the load around more. It looks like the SDK selects the hub with the lowest-latency SPVPong response. This could be misleading, as it doesn't account for Elasticsearch latency and other things that go into servicing hub RPCs.

Also, hub performance could change with day of week or time of day. I don't see any provision for reacting to deteriorated performance, or to a claim_search timeout, by choosing a different hub.
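
To make the idea concrete, here is a rough sketch of ranking hubs by observed RPC latency rather than ping latency alone, with a penalty when a request times out. This is not how the SDK currently works; the HubSelector class, its method names, and the penalty and smoothing values are all hypothetical.

```python
class HubSelector:
    """Hypothetical helper: rank hubs by a moving average of observed latency."""

    def __init__(self, ping_latencies, timeout_penalty=30.0, alpha=0.3):
        # Seed each hub's score with its ping latency (seconds).
        self.scores = dict(ping_latencies)
        self.timeout_penalty = timeout_penalty  # assumed cost of a timeout, in seconds
        self.alpha = alpha  # weight given to the newest observation

    def record(self, hub, seconds):
        """Blend a newly observed RPC latency into the hub's score."""
        old = self.scores[hub]
        self.scores[hub] = (1 - self.alpha) * old + self.alpha * seconds

    def record_timeout(self, hub):
        """Treat a timed-out RPC as a very slow response."""
        self.record(hub, self.timeout_penalty)

    def best(self):
        """Return the hub with the lowest current score."""
        return min(self.scores, key=self.scores.get)
```

With something like this, a hub that pings quickly but times out on claim_search would lose its top ranking after the first timeout, and the next request would go elsewhere.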

moodyjon · 2022-08-17T17:54:18Z

Other ideas from ES documentation:
https://www.elastic.co/guide/en/elasticsearch/reference/8.3/tune-for-search-speed.html
https://www.elastic.co/guide/en/elasticsearch/reference/8.3/tune-for-search-speed.html#search-as-few-fields-as-possible

Hard to say what effect this would have. But 6 fields are being searched currently:

hub/hub/common.py

Line 907 in 34c5ab2

    
           "claim_name^4", "channel_name^8", "title^1", "description^.5", "author^1", "tags^.5"
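
A quick sketch of what searching fewer fields could look like. The field list is the one quoted above; wrapping it in a simple_query_string query and the helper names text_query and drop_fields are my assumptions for illustration, not the hub's actual code. Which fields are safe to drop is a separate question that would need measuring.

```python
# Field list as quoted from hub/hub/common.py (boosts after the caret).
ALL_TEXT_FIELDS = [
    "claim_name^4", "channel_name^8", "title^1",
    "description^.5", "author^1", "tags^.5",
]


def drop_fields(fields, names):
    """Return the boosted field list without the named fields."""
    return [f for f in fields if f.split("^", 1)[0] not in names]


def text_query(text, fields=ALL_TEXT_FIELDS):
    """Build a simple_query_string clause over the given fields (assumed query type)."""
    return {"simple_query_string": {"query": text, "fields": list(fields)}}


# Example: skip the two lowest-boosted fields.
reduced = text_query("(cord + extension)",
                     drop_fields(ALL_TEXT_FIELDS, {"description", "tags"}))
```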

moodyjon · 2022-08-17T18:28:29Z

Another observation... The --query_timeout_ms value (10s default) is passed to the AsyncElasticsearch() constructor:

hub/hub/herald/search.py

Line 62 in 35483fa

self.search_client = AsyncElasticsearch(hosts, timeout=self.search_timeout)

However, there are API-level timeout params accepted for individual calls to ES:

https://www.elastic.co/guide/en/elasticsearch/client/python-api/current/config.html#_api_and_server_timeouts

API-level timeout for search:
https://www.elastic.co/guide/en/elasticsearch/reference/current/search-your-data.html#search-timeout

The allow_partial_search_results option (default true) means the search should never fail on hitting the (API-level) timeout, but instead return whatever it has available once the time budget is exhausted:

https://www.elastic.co/guide/en/elasticsearch/reference/8.3/search-search.html#search-search-api-query-params
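
Roughly what I have in mind, as a sketch only: give the search an API-level time budget somewhat shorter than the client timeout, and check the timed_out flag in the response to see whether the results are partial. The helper names, the 1s margin, and the "claims" index name in the usage note are hypothetical; the client only needs an async search() method like AsyncElasticsearch's.

```python
import asyncio


def api_timeout_kwargs(query_timeout_ms, margin_ms=1000):
    """Build search kwargs with a time budget shorter than the client timeout."""
    budget_ms = max(query_timeout_ms - margin_ms, 1)
    return {
        "timeout": f"{budget_ms}ms",            # API-level search timeout
        "allow_partial_search_results": True,   # return what was found so far
    }


async def search_with_budget(client, index, body, query_timeout_ms=10_000):
    """Run a search and report whether ES stopped early on the time budget."""
    response = await client.search(
        index=index, body=body, **api_timeout_kwargs(query_timeout_ms)
    )
    # ES sets "timed_out" to true when the budget ran out before completion.
    return response["hits"]["hits"], bool(response.get("timed_out"))
```

A caller could then do something like `hits, partial = await search_with_budget(self.search_client, "claims", body)` and decide whether partial results are good enough to return instead of raising "query timed out".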

moodyjon · 2022-08-17T18:33:33Z

Here's the search invocation (no timeout=X):

hub/hub/herald/search.py

Line 208 in 34c5ab2

search_hits = deque((await self.search_client.search(

moodyjon self-assigned this Aug 15, 2022

moodyjon mentioned this issue Aug 17, 2022

Specify API-level search timeout to reduce hard timeout errors. #81

Merged
