Add imbalance factor in test_low_latency #393

JianboDong · 2025-09-04T13:06:07Z

The current test_low_latency script in DeepEP assumes a uniform token distribution across experts (ranks). In real workloads, however, token-to-expert routing is often skewed. To demonstrate this, we recorded token distributions from several layers of the DeepSeek-V3 (dpsk v3) model, using the MMLU Pro dataset as input. As the figures show, the per-rank token counts in individual layers (red line) vary substantially.

While load balancing mechanisms like EPLB can mitigate skew, they cannot perfectly predict future distributions, so meaningful imbalance typically persists. To evaluate DeepEP under these conditions, we propose to extend the test_low_latency script to include imbalanced loads.

Imbalanced Distribution Modeling: --distribution
We tried several statistical functions to fit the real distribution. As the following figures demonstrate, the log-normal distribution consistently provides the best approximation of the observed token imbalance (lowest Sum of Squared Errors, SSE). We also include gamma and power-law distributions as alternative options.
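
To make the three distribution families concrete, here is a minimal sketch of drawing per-rank load weights from each of them. It is not the code from this PR: the function name `sample_rank_weights`, the option strings, and every shape parameter (sigma, shape, alpha) are assumptions chosen for illustration, and it uses only the Python standard library rather than torch.

```python
import random

def sample_rank_weights(distribution, num_ranks, seed=0):
    """Return per-rank load weights that sum to 1 (illustrative parameters only)."""
    rng = random.Random(seed)
    if distribution == "lognormal":
        # Assumed sigma of 0.5; the fitted value from the PR is not stated here.
        raw = [rng.lognormvariate(0.0, 0.5) for _ in range(num_ranks)]
    elif distribution == "gamma":
        # Assumed shape 2.0 and scale 1.0.
        raw = [rng.gammavariate(2.0, 1.0) for _ in range(num_ranks)]
    elif distribution == "powerlaw":
        # A Pareto draw stands in for "power-law"; alpha = 3.0 is an assumption.
        raw = [rng.paretovariate(3.0) for _ in range(num_ranks)]
    else:
        raise ValueError(f"unknown distribution: {distribution}")
    total = sum(raw)
    return [value / total for value in raw]

if __name__ == "__main__":
    for name in ("lognormal", "gamma", "powerlaw"):
        weights = sample_rank_weights(name, num_ranks=8)
        print(name, [round(w, 3) for w in weights])
```

Multiplying these weights by the total token count gives a skewed per-rank load; how the real script maps the chosen distribution to token counts may differ.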

Degree of Imbalance: --imbalance-factors
A new command-line argument, --imbalance-factors, is introduced to control the degree of imbalance. This factor is intuitively defined as max_tokens_per_rank / average_tokens_per_rank. This allows users to easily simulate various levels of load skew.
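
The definition above can be sketched in a few lines, together with one possible way to hit a requested factor. This is an illustration under stated assumptions, not the method used in the PR: the helper names are hypothetical, and the tuning strategy (bisection on the log-normal sigma over a fixed set of normal draws) is a technique swapped in for demonstration. It relies on max/mean growing monotonically with sigma for a fixed set of draws.

```python
import math
import random

def imbalance_factor(tokens_per_rank):
    """max_tokens_per_rank / average_tokens_per_rank, as defined in the text."""
    return max(tokens_per_rank) / (sum(tokens_per_rank) / len(tokens_per_rank))

def lognormal_token_counts(num_ranks, avg_tokens, target_factor, seed=0):
    """Hypothetical helper: integer per-rank token counts near target_factor."""
    if not 1.0 <= target_factor < num_ranks:
        raise ValueError("target_factor must lie in [1, num_ranks)")
    rng = random.Random(seed)
    z = [rng.gauss(0.0, 1.0) for _ in range(num_ranks)]
    z_max = max(z)

    def weights(sigma):
        # Shifting by z_max keeps the largest weight at exactly 1 (no overflow).
        return [math.exp(sigma * (zi - z_max)) for zi in z]

    # Bisection on sigma: sigma = 0 gives factor 1; larger sigma gives more skew.
    # The upper bound of 50 is an assumption; targets very close to num_ranks
    # may not be reachable within it.
    lo, hi = 0.0, 50.0
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if imbalance_factor(weights(mid)) < target_factor:
            lo = mid
        else:
            hi = mid
    w = weights(hi)
    mean_w = sum(w) / num_ranks
    # Rounding to whole tokens is why the real factor can differ from the target.
    return [round(wi / mean_w * avg_tokens) for wi in w]

if __name__ == "__main__":
    counts = lognormal_token_counts(num_ranks=8, avg_tokens=128, target_factor=2.0)
    print(counts, imbalance_factor(counts))
```

Because the counts are rounded to integers, the achieved factor is only approximately equal to the requested one, which is consistent with reporting both a target and a real imbalance factor.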
Test Flow:
The benchmark first runs the default test (with a uniform distribution) as before. It then runs several additional rounds of tests, using the different imbalance factors specified.
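
The flow described above might be organized as in the following sketch. Only the two flag names come from this PR description; how the script actually parses them is not shown here, so the space-separated float list, the option strings, the default value, and the helper names `build_parser` and `plan_rounds` are all assumptions.

```python
import argparse

def build_parser():
    """Hypothetical parser; the real script's argument handling may differ."""
    parser = argparse.ArgumentParser(description="Sketch of the imbalance flags")
    # Assumed option strings and default for the three families named in the PR.
    parser.add_argument("--distribution", default="lognormal",
                        choices=["lognormal", "gamma", "powerlaw"])
    # Assumed format: zero or more space-separated floats.
    parser.add_argument("--imbalance-factors", type=float, nargs="*", default=[])
    return parser

def plan_rounds(args):
    """Uniform baseline first, then one round per requested imbalance factor."""
    rounds = [("uniform", None)]
    rounds += [(args.distribution, factor) for factor in args.imbalance_factors]
    return rounds

if __name__ == "__main__":
    args = build_parser().parse_args(["--imbalance-factors", "1.5", "2.0"])
    for distribution, factor in plan_rounds(args):
        print(f"round: distribution={distribution} target_factor={factor}")
```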
Output Format:
The test results are summarized in tables, where each row shows the results for the corresponding imbalance factor (the results for the default uniform setting are shown in the first row). Each row includes the target and real imbalance factors, followed by key metrics such as the average, max, and min values across all ranks. Unlike the per-rank output of the original script, this provides a more concise, holistic view of the system's overall performance under skewed loads.
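
As a rough illustration of how one summary row could be derived from per-rank measurements, consider the sketch below. The metric (a latency in microseconds), the column layout, and the function names are hypothetical; the actual columns printed by the script are not reproduced here.

```python
def summarize_round(target_factor, tokens_per_rank, latency_us_per_rank):
    """Collapse per-rank values into one row (hypothetical metric and field names)."""
    avg_tokens = sum(tokens_per_rank) / len(tokens_per_rank)
    return {
        "target_factor": target_factor,
        # Real factor is measured from the token counts actually generated.
        "real_factor": max(tokens_per_rank) / avg_tokens,
        "avg_us": sum(latency_us_per_rank) / len(latency_us_per_rank),
        "max_us": max(latency_us_per_rank),
        "min_us": min(latency_us_per_rank),
    }

def format_row(row):
    """Render one fixed-width table row."""
    return (f"{row['target_factor']:>8.2f} {row['real_factor']:>8.2f} "
            f"{row['avg_us']:>10.2f} {row['max_us']:>10.2f} {row['min_us']:>10.2f}")

if __name__ == "__main__":
    # Made-up numbers purely to exercise the formatting.
    row = summarize_round(2.0, [100, 200, 100, 0], [10.0, 20.0, 30.0, 40.0])
    print(f"{'target':>8} {'real':>8} {'avg(us)':>10} {'max(us)':>10} {'min(us)':>10}")
    print(format_row(row))
```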

Huoyuan100861 · 2025-09-14T14:02:57Z

Very useful for optimizing imbalance research.

JianboDong added 2 commits September 4, 2025 19:36

Update test_low_latency.py

bcc90b1

Update utils.py

212ea44
