2 changes: 1 addition & 1 deletion .github/workflows/_base_test.yml
@@ -40,7 +40,7 @@ jobs:
workflow-name: base_test

base_tests:
- runs-on: [self-hosted, GPU-h20-1Cards]
+ runs-on: [self-hosted, GPU-h20-New-Driver]
needs: check_bypass
if: ${{ inputs.FASTDEPLOY_WHEEL_URL != '' && needs.check_bypass.outputs.can-skip != 'true' }}
timeout-minutes: 60
2 changes: 1 addition & 1 deletion .github/workflows/_build_linux.yml
@@ -70,7 +70,7 @@ jobs:
workflow-name: build_gpu

fd-build:
- runs-on: [self-hosted, GPU-Build]
+ runs-on: [self-hosted, GPU-Build-New-Driver]
needs: check_bypass
if: ${{ needs.check_bypass.outputs.can-skip != 'true' }}
timeout-minutes: 360
2 changes: 1 addition & 1 deletion .github/workflows/_logprob_test_linux.yml
@@ -41,7 +41,7 @@ jobs:
workflow-name: logprob_test

run_tests_logprob:
- runs-on: [self-hosted, GPU-h20-1Cards]
+ runs-on: [self-hosted, GPU-h20-New-Driver]
needs: check_bypass
if: ${{ inputs.FASTDEPLOY_WHEEL_URL != '' && needs.check_bypass.outputs.can-skip != 'true' }}
timeout-minutes: 60
2 changes: 1 addition & 1 deletion .github/workflows/_pre_ce_test.yml
@@ -39,7 +39,7 @@ jobs:
workflow-name: pre_ce_test

run_ce_cases:
- runs-on: [self-hosted, PRE_CE_RUN_2Card]
+ runs-on: [self-hosted, GPU-h20-New-Driver]
needs: check_bypass
if: ${{ inputs.FASTDEPLOY_WHEEL_URL != '' && needs.check_bypass.outputs.can-skip != 'true' }}
timeout-minutes: 60
2 changes: 1 addition & 1 deletion .github/workflows/_stable_test.yml
@@ -40,7 +40,7 @@ jobs:
workflow-name: stable_test

stable_tests:
- runs-on: [self-hosted, GPU-h20-2Cards]
+ runs-on: [self-hosted, GPU-h20-New-Driver]
needs: check_bypass
if: ${{ inputs.FASTDEPLOY_WHEEL_URL != '' && needs.check_bypass.outputs.can-skip != 'true' }}
timeout-minutes: 60
2 changes: 1 addition & 1 deletion .github/workflows/_unit_test_coverage.yml
@@ -40,7 +40,7 @@ jobs:
workflow-name: coverage

run_tests_with_coverage:
- runs-on: [self-hosted, GPU-h1z1-2Cards]
+ runs-on: [self-hosted, GPU-h20-New-Driver]
timeout-minutes: 105
needs: check_cov_skip
if: ${{ inputs.FASTDEPLOY_WHEEL_URL != '' && needs.check_cov_skip.outputs.can-skip != 'true' }}
4 changes: 3 additions & 1 deletion tests/layers/test_plas_attention.py
@@ -101,7 +101,9 @@ def setUp(self):
[self.tokens + self.attn_block_m, self.num_kv_heads, self.head_dim],
dtype="bfloat16",
)
- self.rotary_embs = paddle.ones([2, self.seq_len, self.head_dim // 2], dtype="float32")
+ rotary_cos = paddle.ones([1, self.plas_max_seq_length, self.head_dim // 2], dtype="float32")
+ rotary_sin = paddle.zeros([1, self.plas_max_seq_length, self.head_dim // 2], dtype="float32")
+ self.rotary_embs = paddle.concat([rotary_cos, rotary_sin], axis=0)

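Question: the second dimension of rotary_embs changed from seq_len to plas_max_seq_length

The original initialization used seq_len as the time dimension; the change uses plas_max_seq_length instead. The two mean different things: seq_len is the sequence length used by the current test, while plas_max_seq_length is the maximum sequence length that PLAS Attention allows.

Please confirm:

  1. Does the PLAS Attention kernel require the rotary embedding to be pre-allocated to plas_max_seq_length?
  2. Did using seq_len previously cause a dimension-mismatch bug? (If so, please note in the PR description that this is a fix.)

If the switch to plas_max_seq_length is only there to make the test pass on the new driver, please say so in a code comment.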
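
For reference, the diff changes two things at once: the position dimension of the table, and the value of the sin half (previously all ones, now all zeros). The sketch below uses plain Python stand-ins rather than paddle tensors to show both effects. The sizes are hypothetical, and the pairwise rotation is the textbook rotary form, which is an assumption here; the actual PLAS Attention kernel may lay out or apply the table differently.

```python
# Illustrative sketch only: plain-Python stand-in for the paddle tensors in setUp.
# The pairwise rotation is the textbook rotary-embedding form (an assumption);
# the real PLAS attention kernel may index or apply the table differently.

def rotate_pair(x0, x1, cos, sin):
    """Rotate one (x0, x1) feature pair using the given cos and sin entries."""
    return x0 * cos - x1 * sin, x0 * sin + x1 * cos


def table_shape(num_positions, head_dim):
    """Shape of a [cos; sin] table stacked along axis 0, as the concat produces."""
    return [2, num_positions, head_dim // 2]


# Hypothetical sizes, chosen only for illustration (not taken from the test).
seq_len, plas_max_seq_length, head_dim = 8, 32, 4

old_shape = table_shape(seq_len, head_dim)              # table sized by seq_len
new_shape = table_shape(plas_max_seq_length, head_dim)  # table sized by the maximum

# Old initialisation: every entry is 1, so both the cos and sin entries are 1.
old_result = rotate_pair(3.0, 4.0, 1.0, 1.0)
# New initialisation: cos entries are 1 and sin entries are 0, so the pair is unchanged.
new_result = rotate_pair(3.0, 4.0, 1.0, 0.0)
```

Under this assumed formulation, the new table acts as an identity rotation at every position, while the old all-ones table does not. That may be worth mentioning alongside the dimension change when answering the questions above.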


self.attn_gate_weight = paddle.randn(
[self.num_kv_heads, self.plas_block_size, self.head_dim], dtype="bfloat16"