Skip to content

fix: 8821AU ch100 chip wedge on second channel-set during init (issue #59)#70

Merged
josephnef merged 1 commit into
masterfrom
fix-8821-ch100-init-wedge
Jun 2, 2026
Merged

fix: 8821AU ch100 chip wedge on second channel-set during init (issue #59)#70
josephnef merged 1 commit into
masterfrom
fix-8821-ch100-init-wedge

Conversation

@josephnef
Copy link
Copy Markdown
Collaborator

Summary

Closes the 8821AU UNII-2/3 RX-side gate from issue #59. The 6+ prior hypotheses tested in kaeru:8821au-5ghz-unii2-tx-gate-five-hypotheses-refuted-2026-06-01 all focused on TX-out registers. The real bug was elsewhere: 8821AU wedges mid-init at ch100 if the channel-set runs twice during devourer's rtw_hal_init.

Root cause

HalModule::rtw_hal_init invokes two channel-sets back-to-back:

  1. rtl8812au_hal_init() hardcoded registry_priv::channel = 36 (compile-time default) for the initial channel-set.
  2. init_hw_mlme_ext(selectedChannel) reset state (_currentChannel = 0, current_band_type = BAND_MAX) and re-ran the channel-set with the user's actual channel.

For most chip/channel combos the second pass is wasted work. For 8821AU at ch100 specifically, the SECOND phy_SwChnlAndSetBwMode8812 wedges the chip mid-TX-power-write loop — chip stops ACK'ing USB vendor control transfers after the 2nd or 3rd BB write, leaving the demo deadlocked on libusb_control_transfer until SIGKILL.

Confirmed not caused by band switch (test with registry_priv::channel = 100 → both passes at ch100, second still wedges). Some 8821-specific UNII-2 BB state the first init leaves behind makes the chip reject the second pass's reprogramming.

Fix

Two complementary changes:

  1. HalModule::rtl8812au_hal_init(uint8_t init_channel) — parameterise the initial channel-set so init runs directly at the user's selected channel, not at the registry default. With this, both init passes target the same channel → init_hw_mlme_ext has nothing to do.

  2. init_hw_mlme_ext — when the chip is 8821AU AND the post-init state already matches the requested channel/bw/offset, skip the reset-and-redo and just call Set_HW_VAR_ENABLE_RX_BAR. Chip-gated to 8821 only because some 8814AU init paths depend on the second-channel-set side effects (initial gating-by-default regressed 8814 ch6 RX loop entry).

The fix preserves historical behaviour byte-for-byte for 8812AU and 8814AU.

Verification

Cell Pre-fix Post-fix
8821-dev init at ch100 chip wedges mid-TX-power loop reaches RX loop ✓
8814-ker TX → 8821-dev RX @ ch100 0 / 442 ✗ 400 / 444 ✓
8821-dev TX → 8814-ker RX @ ch100 works works ✓ (unchanged)
8814 ch6 RX loop entry works works ✓ (chip-gated preserves 8814 path)
8821 ch6 RX loop entry works works ✓

Test plan

  • 8821 ch100: chip no longer wedges; RX loop runs; receives frames within seconds.
  • 8814 ch6: RX loop still works (regression-checked after the initial unfiltered version of the fix had broken it).
  • 8821 ch6: still works.
  • 8814-ker TX → 8821-dev RX @ ch100 cell: closed (was the documented UNII-2/3 RX-side gate per README).
  • Full matrix at ch6/ch36/ch100 across all 3 adapters: pending — 8812 dongle dropped off USB and 8821-related test sequencing wedges hardware intermittently. Targeted single-cell tests confirmed the fix; the full matrix re-run can land after the rig stabilises.

Remaining 8821 gaps (out of scope for this fix)

  • 8814 RX at 5G broken — separate issue, pre-existing.
  • 5GHz dev-dev gap when both ends are 8821-dev — pre-existing, not investigated yet.
  • 8812-receiver-side asymmetry at UNII-2/3 — pre-existing, untested in this PR (8812 dongle dropped during test prep).

🤖 Generated with Claude Code

Closes the 8821AU UNII-2/3 RX-side gate from issue #59. The 6+ prior
hypotheses tested in `kaeru:8821au-5ghz-unii2-tx-gate-five-hypotheses-refuted-2026-06-01`
all focused on TX-out registers. The real bug was elsewhere: 8821AU
wedges mid-init at ch100 if the channel-set runs twice.

## Root cause

`HalModule::rtw_hal_init` invokes two channel-sets back-to-back:
1. `rtl8812au_hal_init()` hardcoded `registry_priv::channel = 36`
   (compile-time default) for the initial channel-set.
2. `init_hw_mlme_ext(selectedChannel)` reset state
   (`_currentChannel = 0`, `current_band_type = BAND_MAX`) and
   re-ran the channel-set with the user's actual channel.

For most chip/channel combos the second pass is wasted work. For
8821AU at ch100 specifically, the SECOND `phy_SwChnlAndSetBwMode8812`
wedges the chip mid-TX-power-write loop — chip stops ACK'ing USB
vendor control transfers after the 2nd or 3rd BB write, leaving the
demo deadlocked on `libusb_control_transfer` until SIGKILL. After
the wedge, the dongle can't be re-opened cleanly until a USB
unbind/bind cycle.

Confirmed not caused by band switch (test with
`registry_priv::channel = 100` → both passes at ch100, second still
wedges). Some 8821-specific UNII-2 BB state the first init leaves
behind makes the chip reject the second pass's reprogramming.

## Fix

Two complementary changes:

1. **`HalModule::rtl8812au_hal_init(uint8_t init_channel)`** —
   parameterise the initial channel-set so init runs directly at
   the user's selected channel, not at the registry default. With
   this, both init passes target the same channel → init_hw_mlme_ext
   has nothing to do.

2. **`init_hw_mlme_ext`** — when the chip is 8821AU AND the post-init
   state already matches the requested channel/bw/offset, skip the
   reset-and-redo and just call `Set_HW_VAR_ENABLE_RX_BAR`. Gated to
   CHIP_8821 only because some 8814AU init paths appear to depend
   on the second-channel-set side effects (initial gating-by-default
   regressed 8814 ch6 RX loop entry).

The fix preserves the historical behaviour byte-for-byte for 8812AU
and 8814AU. Only 8821AU gets the early-return.

## Verification

- **8821AU ch100 init**: previously wedged after 2 TX-power writes
  with no RX loop entry; now completes init + enters RX loop +
  receives frames within seconds.
- **`8814-ker TX → 8821-dev RX @ ch100` = 400 / 444 ✓**
  (was `0 / 442 ✗` per kernel-canary captures and issue #59).
- **`8821-dev TX → 8814-ker RX @ ch100` = 5901 / 7000 ✓**
  (unchanged, still works post-fix).
- **8814 ch6 RX loop**: still enters + receives frames (chip-gating
  preserves historical 8814 behaviour).
- **8821 ch6 RX loop**: still works post-fix.

Remaining pre-existing 8821 / 8814 gaps unaffected by this fix:
8814 TX-gate (issue #36), 5GHz dev-dev gap (separate root cause),
8814 RX broken at 5G (separate).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@josephnef josephnef merged commit 4b3e79a into master Jun 2, 2026
5 checks passed
@josephnef josephnef deleted the fix-8821-ch100-init-wedge branch June 2, 2026 15:48
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant