Add docs

near · Dec 11, 2024 · 669d16f · 669d16f
1 parent 86c6157
commit 669d16f
Showing 1 changed file with 104 additions and 0 deletions.
diff --git a/docs/practices/workflows/benchmarking_synthetic_workloads.md b/docs/practices/workflows/benchmarking_synthetic_workloads.md
@@ -0,0 +1,104 @@
+# Benchmarking synthetic workloads
+
+Benchmarking a synthetic workload starts with a new network that has empty state. State is then created, and afterwards transactions involving that state are generated. For example, the native token transfer workload creates `n` accounts with NEAR balance and then generates transactions to transfer the native token between accounts.
+
+This approach has the following benefits:
+
+- Relatively simple and quick setup, as no state from real-world networks is involved.
+- Fine-grained control over traffic intensity.
+- Enables comparing `neard` performance at different points in time or with different features.
+- Might expose performance bottlenecks.
+
+The main drawbacks of synthetic benchmarks are:
+
+- Conclusions are limited in scope, as real-world traffic is not homogeneous.
+- Calibrating traffic generation parameters can be cumbersome.
+
+The tooling for synthetic benchmarks is available in [`benchmarks/synth-bm`](~/benchmarks/synth-bm).
+
+## Common parameters
+
+The following parameters are common to multiple tasks:
+
+### `rpc-url`
+
+The RPC endpoint to which transactions are sent.
+
+Synthetic benchmarking may create thousands of transactions per second, which may hit network limitations if the RPC is located on a separate machine. In particular, sending transactions to nodes running on GCP requires care, as it can cause temporary IP address bans. For that scenario, it is recommended to run a separate traffic generation VM located in the same GCP zone as the RPC node and send transactions to its `internal IP`.
+
+### `interval-duration-micros`
+
+Controls the rate at which transactions are sent. Assuming your hardware is able to send a request at every interval tick, the number of transactions sent per second equals `1_000_000 / interval-duration-micros`. The rate might be slowed down if `channel-buffer-size` becomes a bottleneck.
+
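As a worked example of the formula above, the following Python sketch converts between the interval and the target rate. The function names are illustrative and not part of the tooling, and the result is only an upper bound: it assumes the sender keeps up with every interval tick.

```python
MICROS_PER_SECOND = 1_000_000


def target_tps(interval_duration_micros: int) -> float:
    """Upper bound on transactions sent per second for a given interval."""
    if interval_duration_micros <= 0:
        raise ValueError("interval must be positive")
    return MICROS_PER_SECOND / interval_duration_micros


def interval_for_tps(tps: float) -> int:
    """Interval in microseconds (rounded to the nearest integer) for a target rate."""
    if tps <= 0:
        raise ValueError("tps must be positive")
    return round(MICROS_PER_SECOND / tps)
```

For instance, `target_tps(500)` evaluates to `2000.0`, and `interval_for_tps(4000)` evaluates to `250`.
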
+### `channel-buffer-size`
+
+Before an RPC request is sent, the tooling awaits capacity to send into a buffered channel. Thereby the number of outstanding RPC requests is limited by `channel-buffer-size`. This can slow down the rate at which transactions are sent in case the node is congested. To disable that behavior, set `channel-buffer-size` to a large value, e.g. the total number of transactions to be sent.
+
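The back-pressure described above can be illustrated with a small Python analogy. This is not the tool's implementation: it swaps the buffered channel for an `asyncio.Semaphore`, replaces the RPC request with a sleep, and all names and parameters are hypothetical.

```python
import asyncio


async def send_all(num_requests, buffer_size, interval_s, request_latency_s):
    """Send simulated requests; return the peak number of outstanding requests."""
    capacity = asyncio.Semaphore(buffer_size)
    in_flight = 0
    max_in_flight = 0

    async def fake_rpc_request():
        nonlocal in_flight
        await asyncio.sleep(request_latency_s)  # stand-in for the RPC round trip
        in_flight -= 1
        capacity.release()  # free one slot once the response has arrived

    tasks = []
    for _ in range(num_requests):
        # Waits here whenever `buffer_size` requests are already outstanding,
        # which is what slows the send rate when the node responds slowly.
        await capacity.acquire()
        in_flight += 1
        max_in_flight = max(max_in_flight, in_flight)
        tasks.append(asyncio.create_task(fake_rpc_request()))
        await asyncio.sleep(interval_s)  # stand-in for the interval tick
    await asyncio.gather(*tasks)
    return max_in_flight
```

With a small `buffer_size` and slow responses, the sender stalls at `capacity.acquire()` and the number of outstanding requests never exceeds `buffer_size`. With `buffer_size` equal to `num_requests`, the sender never waits for capacity.
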
+## Workflows
+
+The tooling's [`justfile`](~/benchmarks/synth-bm/justfile) contains recipes for the most relevant workflows.
+
+### Create sub accounts
+
+Creating the state for synthetic benchmarks usually starts with creating accounts. We create sub accounts for the account specified by `--signer-key-path`. This avoids dealing with the registrar, which would be required for creating top level accounts. To view all options, run:
+
+```command
+cargo run --release -- create-sub-accounts --help
+```
+
+### Benchmark native token transfers
+
+Generates a native token transfer workload involving the accounts provided in `--user-data-dir`. Transactions are generated by iterating through these accounts and sending native tokens to a randomly chosen receiver from the same set of accounts. To view all options, run:
+
+```command
+cargo run --release -- benchmark-native-transfers --help
+```
+
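The sender and receiver selection described above can be sketched in Python as follows. The function and account names are hypothetical. Whether the tool excludes self-transfers is not stated in this document; the sketch excludes them as an assumption.

```python
import random
from itertools import cycle, islice


def generate_transfers(accounts, num_transfers, seed=None):
    """Return (sender, receiver) pairs: senders in round-robin order, receivers random."""
    if len(set(accounts)) < 2:
        raise ValueError("need at least two distinct accounts")
    rng = random.Random(seed)
    transfers = []
    # Iterate through the accounts repeatedly until enough transfers exist.
    for sender in islice(cycle(accounts), num_transfers):
        receiver = rng.choice(accounts)
        while receiver == sender:  # assumption: no self-transfers
            receiver = rng.choice(accounts)
        transfers.append((sender, receiver))
    return transfers
```

Calling `generate_transfers(["a.test", "b.test", "c.test"], 7)` yields seven pairs whose senders cycle through the three accounts in order.
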
+Automatic calculation of transactions per second (TPS) when RPC requests are sent with `wait_until: NONE` is coming up shortly. In the meantime, TPS can be calculated manually by querying the `near_transaction_processed_successfully_total` metric, e.g. with:
+
+```command
+http localhost:3030/metrics | grep transaction_processed
+```
+
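One way to turn two readings of that metric into an average TPS is sketched below in Python. It assumes the endpoint returns Prometheus text format without trailing timestamps, and the helper names are illustrative rather than part of the tooling.

```python
METRIC = "near_transaction_processed_successfully_total"


def read_counter(metrics_text: str, name: str = METRIC) -> float:
    """Sum all samples of counter `name` in Prometheus text-format output."""
    total = 0.0
    found = False
    for line in metrics_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):  # skip blanks and HELP/TYPE comments
            continue
        metric, _, value = line.rpartition(" ")
        # Match the bare metric name as well as labeled samples like name{...}.
        if metric == name or metric.startswith(name + "{"):
            total += float(value)
            found = True
    if not found:
        raise KeyError(name)
    return total


def average_tps(before: str, after: str, elapsed_seconds: float) -> float:
    """Average TPS between two metric snapshots taken `elapsed_seconds` apart."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return (read_counter(after) - read_counter(before)) / elapsed_seconds
```

Take one snapshot of the metrics output while the workload is running, take a second one a known number of seconds later, and pass both texts to `average_tps`.
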
+## Network setup and `neard` configuration
+
+Details of bringing up and configuring a network are out of scope for this document. Instead, we just give a brief overview of the setup regularly used to benchmark TPS of common workloads in a single-node, single-shard setup.
+
+### Build `neard`
+
+Choose the git commit and cargo features corresponding to what you want to benchmark. Most likely you will want a `--release` build to measure TPS.
+
+### Initialize the network
+
+```command
+./neard --home .near init --chain-id localnet
+```
+
+### Enable memtrie
+
+The configuration generated by the above command does not enable memtrie. However, most benchmarks should run against a node with memtrie enabled, which can be achieved by setting the following in `.near/config.json`:
+
+```
+"load_mem_tries_for_tracked_shards": true
+```
+
+### Un-limit configuration
+
+Following the steps so far creates a config that will throttle throughput due to various factors related to state witness size, gas/compute limits, and congestion control. In case you want to benchmark a node that fully utilizes its hardware, you can make the following modifications to effectively run with an unlimited configuration:
+
+```
+# Modifications in .near/genesis.json
+
+"chain_id": "benchmarknet"
+"gas_limit": 20000000000000000 # increase default by x20
+
+# Modifications in .near/config.json
+"view_client_threads": 8 # increase default by x2
+"load_mem_tries_for_tracked_shards": true
+"produce_chunk_add_transactions_time_limit": {
+  "secs": 0,
+  "nanos": 800000000 # increase default by x4
+}
+```
+
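The modifications listed above could be applied with a short Python script instead of editing the files by hand. This is a sketch under two assumptions: each key sits at the top level of its JSON file, and the files live under `.near/` as in the init command. `apply_overrides` is a hypothetical helper, not part of `neard` or the benchmark tooling.

```python
import json
from pathlib import Path

# Values copied from the modifications listed above.
GENESIS_OVERRIDES = {
    "chain_id": "benchmarknet",
    "gas_limit": 20_000_000_000_000_000,
}
CONFIG_OVERRIDES = {
    "view_client_threads": 8,
    "load_mem_tries_for_tracked_shards": True,
    "produce_chunk_add_transactions_time_limit": {"secs": 0, "nanos": 800_000_000},
}


def apply_overrides(path, overrides):
    """Set top-level keys in a JSON file and write the result back in place."""
    path = Path(path)
    data = json.loads(path.read_text())
    data.update(overrides)  # assumption: all overridden keys are top-level
    path.write_text(json.dumps(data, indent=2) + "\n")
    return data
```

A typical use would be `apply_overrides(".near/genesis.json", GENESIS_OVERRIDES)` followed by `apply_overrides(".near/config.json", CONFIG_OVERRIDES)`. The script rewrites the files with its own indentation, so keeping a backup copy first is advisable.
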
+Note that as `nearcore` evolves, these steps and `BENCHMARKNET` adjustments might need to be updated to achieve the effect of unlimiting the configuration.