Improve --text search timeouts #71

Open
jessopb opened this issue Aug 12, 2022 · 5 comments
@jessopb
Member

jessopb commented Aug 12, 2022

For example,
./lbrynet claim search --text="(\"silver\" + bitten)"
finally returns after the 5th try for me.

@moodyjon moodyjon self-assigned this Aug 15, 2022
@moodyjon
Contributor

I modified scripts/test_claim_search.py and tested a few text queries against spvNN.lbry.com. The hubs that accept connections usually reply within the 10s timeout, but I did get one close call (9.9s) and one timeout.

(Omitting the non-responsive spv11, spv12, spv13, spv14, and spv15.)

(lbry-venv-3.9) swdev1@Jonathans-Mac-mini lbry-sdk % python3 ./scripts/test_claim_search.py
{
    "server": "('spv16.lbry.com', 50001)",
    "time": 0.7056792500000002,
    "error": null,
    "args": {
        "no_totals": false,
        "page_size": 100,
        "page": 1,
        "text": "(\"silver\" + bitten)"
    },
    "offset": 0,
    "total": 12,
    "blocked_total": 0
}
{
    "server": "('spv17.lbry.com', 50001)",
    "time": 0.20056029199999958,
    "error": null,
    "args": {
        "no_totals": false,
        "page_size": 100,
        "page": 1,
        "text": "(\"silver\" + bitten)"
    },
    "offset": 0,
    "total": 12,
    "blocked_total": 0
}
{
    "server": "('spv18.lbry.com', 50001)",
    "time": 0.817374,
    "error": null,
    "args": {
        "no_totals": false,
        "page_size": 100,
        "page": 1,
        "text": "(\"silver\" + bitten)"
    },
    "offset": 0,
    "total": 12,
    "blocked_total": 0
}
{
    "server": "('spv19.lbry.com', 50001)",
    "time": 0.3833993339999999,
    "error": null,
    "args": {
        "no_totals": false,
        "page_size": 100,
        "page": 1,
        "text": "(\"silver\" + bitten)"
    },
    "offset": 0,
    "total": 12,
    "blocked_total": 0
}

Second query:

(lbry-venv-3.9) swdev1@Jonathans-Mac-mini lbry-sdk % python3 ./scripts/test_claim_search.py
{
    "server": "('spv16.lbry.com', 50001)",
    "time": 1.4023516669999996,
    "error": null,
    "args": {
        "no_totals": false,
        "page_size": 100,
        "page": 1,
        "text": "(cord + extension)"
    },
    "offset": 0,
    "total": 399,
    "blocked_total": 0
}
{
    "server": "('spv18.lbry.com', 50001)",
    "time": 1.0191300420000005,
    "error": null,
    "args": {
        "no_totals": false,
        "page_size": 100,
        "page": 1,
        "text": "(cord + extension)"
    },
    "offset": 0,
    "total": 399,
    "blocked_total": 0
}
{
    "server": "('spv19.lbry.com', 50001)",
    "time": 0.3879024590000011,
    "error": null,
    "args": {
        "no_totals": false,
        "page_size": 100,
        "page": 1,
        "text": "(cord + extension)"
    },
    "offset": 0,
    "total": 399,
    "blocked_total": 0
}
{
    "server": "('spv17.lbry.com', 50001)",
    "time": 9.945044459000002,
    "error": null,
    "args": {
        "no_totals": false,
        "page_size": 100,
        "page": 1,
        "text": "(cord + extension)"
    },
    "offset": 0,
    "total": 399,
    "blocked_total": 0
}

Third query:

(lbry-venv-3.9) swdev1@Jonathans-Mac-mini lbry-sdk % python3 ./scripts/test_claim_search.py
{
    "server": "('spv16.lbry.com', 50001)",
    "time": 3.2578703750000004,
    "error": null,
    "args": {
        "no_totals": false,
        "page_size": 100,
        "page": 1,
        "text": "foo + bar | baz"
    },
    "offset": 0,
    "total": 1000,
    "blocked_total": 0
}
{
    "server": "('spv19.lbry.com', 50001)",
    "time": 0.6107135420000009,
    "error": null,
    "args": {
        "no_totals": false,
        "page_size": 100,
        "page": 1,
        "text": "foo + bar | baz"
    },
    "offset": 0,
    "total": 1000,
    "blocked_total": 0
}
{
    "server": "('spv18.lbry.com', 50001)",
    "time": 1.7782999579999998,
    "error": null,
    "args": {
        "no_totals": false,
        "page_size": 100,
        "page": 1,
        "text": "foo + bar | baz"
    },
    "offset": 0,
    "total": 1000,
    "blocked_total": 0
}
Wallet server (spv17.lbry.com:50001) returned an error. Code: -32000 Message: query timed out
{
    "server": "('spv17.lbry.com', 50001)",
    "time": 11.295952083,
    "error": "(-32000, 'query timed out')",
    "args": {
        "no_totals": false,
        "page_size": 100,
        "page": 1,
        "text": "foo + bar | baz"
    },
    "offset": null,
    "total": null,
    "blocked_total": null
}

@moodyjon
Contributor

moodyjon commented Aug 16, 2022

Perhaps the thing to do here is to spread the load around more. It looks like the SDK selects the hub with the lowest-latency SPVPong response. This could be misleading, as it doesn't account for Elasticsearch latency and other factors that go into servicing hub RPCs.

Also, hub performance could change with the day of the week or the time of day. I don't see a provision to react to deteriorated performance, or to a claim_search timeout, by choosing a different hub.
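
To make the idea concrete, here is a minimal sketch of ranking hubs by recently observed claim_search latency rather than by ping alone, with a timeout counted as a penalty. This is not the SDK's current logic; `HubSelector`, its methods, the window size, and the penalty value are all hypothetical names and numbers chosen for illustration.

```python
from collections import deque
from statistics import mean


class HubSelector:
    """Hypothetical helper: prefer the hub with the best recent RPC latency."""

    def __init__(self, hubs, window=10, timeout_penalty=10.0):
        self.hubs = list(hubs)
        # Seconds charged to a hub for each request that timed out (assumed value).
        self.timeout_penalty = timeout_penalty
        # Keep only the most recent `window` samples so stale data ages out.
        self.samples = {hub: deque(maxlen=window) for hub in self.hubs}

    def record(self, hub, seconds=None, timed_out=False):
        """Store one observed claim_search duration, or a penalty on timeout."""
        self.samples[hub].append(self.timeout_penalty if timed_out else seconds)

    def score(self, hub):
        """Mean recent latency; hubs with no samples score 0.0 and get tried first."""
        observed = self.samples[hub]
        return mean(observed) if observed else 0.0

    def choose(self):
        """Return the hub with the lowest score."""
        return min(self.hubs, key=self.score)


selector = HubSelector(["spv17.lbry.com", "spv19.lbry.com"])
# Timings taken from the test runs earlier in this thread.
selector.record("spv17.lbry.com", seconds=9.945)
selector.record("spv17.lbry.com", timed_out=True)
selector.record("spv19.lbry.com", seconds=0.388)
selector.record("spv19.lbry.com", seconds=0.611)
best = selector.choose()
```

Because the window is bounded, a hub that was slow at one time of day would become eligible again once newer, faster samples replace the old ones.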

@moodyjon
Contributor

Other ideas from ES documentation:
https://www.elastic.co/guide/en/elasticsearch/reference/8.3/tune-for-search-speed.html
https://www.elastic.co/guide/en/elasticsearch/reference/8.3/tune-for-search-speed.html#search-as-few-fields-as-possible

It is hard to say what effect this would have, but six fields are currently being searched:

"claim_name^4", "channel_name^8", "title^1", "description^.5", "author^1", "tags^.5"

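A rough sketch of what searching fewer fields could look like, assuming the text search is expressed as an Elasticsearch `simple_query_string` clause (an assumption here; the hub's actual query construction is not shown in this thread). `build_text_query` and `drop_fields` are hypothetical helpers, and which fields are safe to drop is a product decision this sketch does not answer.

```python
# Field boosts as quoted above.
ALL_FIELDS = [
    "claim_name^4", "channel_name^8", "title^1",
    "description^.5", "author^1", "tags^.5",
]


def drop_fields(fields, excluded):
    """Remove boosted fields by bare name, e.g. 'tags' matches 'tags^.5'."""
    return [f for f in fields if f.split("^", 1)[0] not in excluded]


def build_text_query(text, fields=ALL_FIELDS):
    """Build a simple_query_string clause restricted to the given fields."""
    return {"simple_query_string": {"query": text, "fields": list(fields)}}


# Example: skip the two lowest-boosted fields for the query from this issue.
narrow_fields = drop_fields(ALL_FIELDS, {"description", "tags"})
narrow_query = build_text_query('("silver" + bitten)', narrow_fields)
```

Timing the narrowed query against the full one on the same hub would show whether the field count matters in practice.
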
@moodyjon
Contributor

Another observation: the --query_timeout_ms value (10s default) is passed into the AsyncElasticsearch() constructor:

self.search_client = AsyncElasticsearch(hosts, timeout=self.search_timeout)

However, there are API-level timeout params accepted for individual calls to ES:

https://www.elastic.co/guide/en/elasticsearch/client/python-api/current/config.html#_api_and_server_timeouts

API-level timeout for search:
https://www.elastic.co/guide/en/elasticsearch/reference/current/search-your-data.html#search-timeout

The allow_partial_search_results option (default true) means the search should never fail on hitting the (API-level) timeout, but return whatever it has available after the time-budget is exhausted:

https://www.elastic.co/guide/en/elasticsearch/reference/8.3/search-search.html#search-search-api-query-params
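
If partial results were allowed, the caller would need to tell a complete response from a partial one. The Search API response carries a `timed_out` flag and a `_shards` summary for this. The sketch below only inspects a response dictionary; `summarize_search_response` is a hypothetical helper, and the sample response is hand-written for illustration, not captured from a hub.

```python
def summarize_search_response(response):
    """Classify an Elasticsearch search response as complete or partial."""
    shards = response.get("_shards", {})
    # A response is partial if the time budget ran out or any shard failed.
    partial = bool(response.get("timed_out")) or shards.get("failed", 0) > 0
    return {
        "partial": partial,
        "took_ms": response.get("took"),
        "hits": response.get("hits", {}).get("hits", []),
    }


# Hand-written example of a response that hit the API-level timeout.
sample = {
    "took": 5000,
    "timed_out": True,
    "_shards": {"total": 5, "successful": 5, "skipped": 0, "failed": 0},
    "hits": {"hits": [{"_id": "abc"}]},
}
summary = summarize_search_response(sample)
```

A hub could then decide whether to return the partial hits as-is or to flag them to the client.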

@moodyjon
Contributor

Here's the search invocation (no timeout=X):

search_hits = deque((await self.search_client.search(
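
For comparison, a sketch of an invocation that does pass an API-level time budget. `timeout` and `allow_partial_search_results` are Search API parameters; how the Python client accepts them (keyword arguments versus `params`) depends on the client version, so check that against the installed release. `search_kwargs`, `run_search`, `FakeSearchClient`, the `claims` index name, and the 500 ms margin are all hypothetical; the fake client exists only so the sketch runs without a cluster.

```python
import asyncio


def search_kwargs(query_timeout_ms, margin_ms=500):
    """Budget the ES-side search slightly under the overall query timeout."""
    budget_ms = max(query_timeout_ms - margin_ms, 1)
    return {"timeout": f"{budget_ms}ms", "allow_partial_search_results": True}


class FakeSearchClient:
    """Stand-in for the real async client; it echoes back what it was given."""

    async def search(self, **kwargs):
        return {"timed_out": False, "hits": {"hits": []}, "echo": kwargs}


async def run_search(client, index, body, query_timeout_ms=10_000):
    """Call search() with an explicit per-request time budget."""
    return await client.search(index=index, body=body, **search_kwargs(query_timeout_ms))


result = asyncio.run(
    run_search(FakeSearchClient(), "claims", {"query": {"match_all": {}}})
)
```

Keeping the ES-side budget below the transport timeout gives Elasticsearch a chance to return what it has before the connection itself gives up.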
