When running the convergence test case in Testground with more than 48 nodes, the execution fails. At times during the execution, nodes are unable to dial other peers.
Testground command: testground run single --plan=casm --testcase="pex-convergence" --runner=local:docker --builder=docker:go --instances=48
The main hypothesis is that the network and the containers get overloaded. The convergence test case uses Redis barriers in every iteration to synchronize nodes, and Redis barriers are known to have a large overhead.
Another test case was created to check whether the name resolution problems arise from overloading the system. The test simply runs N nodes, and all nodes except one call a Redis barrier. Preliminary tests show that with 200-500 nodes, Testground raises name resolution errors.
Error output: {"ts":1636108935146165512,"msg":"failed while getting barriers; iteration skipped","group_id":"single","run_id":"c62ggeeqmkrnfmls1p40","process":"barriers","error":"dial tcp: lookup testground-redis: Temporary failure in name resolution"}
and also
failed to send batch to InfluxDB; attempt 1; err: Post "http://testground-influxdb:8086/write?consistency=&db=testground&precision=ns&rp=": dial tcp: lookup testground-influxdb: Temporary failure in name resolution
The command to run the new test case is: testground run single --plan=casm --testcase="dns-test" --runner=local:docker --builder=docker:go --instances=500
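For readers unfamiliar with the layout of the dns-test case (N nodes, all but one calling a barrier), here is a minimal in-process sketch in Go. It is only an illustration under stated assumptions: the `barrier` type, `signalAndWait`, and `simulateBarrier` are hypothetical names, the barrier is a local stand-in built on the standard library rather than the Redis-backed barrier Testground uses, and no DNS lookups or containers are involved, so it does not reproduce the failure itself.

```go
package main

import (
	"fmt"
	"sync"
)

// barrier is a hypothetical in-process stand-in for a Redis-backed barrier:
// callers signal arrival and block until `target` callers have arrived.
type barrier struct {
	mu      sync.Mutex
	count   int
	target  int
	reached chan struct{}
}

func newBarrier(target int) *barrier {
	return &barrier{target: target, reached: make(chan struct{})}
}

// signalAndWait records one arrival and blocks until the target is met.
func (b *barrier) signalAndWait() {
	b.mu.Lock()
	b.count++
	if b.count == b.target {
		close(b.reached) // release every waiting node at once
	}
	b.mu.Unlock()
	<-b.reached
}

// simulateBarrier starts n nodes; every node except node 0 calls the barrier.
// It returns how many nodes passed the barrier.
func simulateBarrier(n int) int {
	b := newBarrier(n - 1)
	var wg sync.WaitGroup
	var mu sync.Mutex
	passed := 0
	for id := 1; id < n; id++ { // node 0 is the one that skips the barrier
		wg.Add(1)
		go func() {
			defer wg.Done()
			b.signalAndWait()
			mu.Lock()
			passed++
			mu.Unlock()
		}()
	}
	wg.Wait()
	return passed
}

func main() {
	fmt.Println(simulateBarrier(8))
}
```

In the real test each arrival is a network round trip to the testground-redis host, which is where the name resolution step comes in; the sketch only shows the synchronization pattern.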