Skip to content

EFUSE TX-power port + cross-validation oracle + 4 init-drift bug fixes (T1)#64

Merged
josephnef merged 7 commits into
masterfrom
t1-canary-diff-oracle
Jun 1, 2026
Merged

EFUSE TX-power port + cross-validation oracle + 4 init-drift bug fixes (T1)#64
josephnef merged 7 commits into
masterfrom
t1-canary-diff-oracle

Conversation

@josephnef
Copy link
Copy Markdown
Collaborator

Summary

Implements the TODO.md T1 task ("RF register cross-validation oracle"). Lands a diagnostic facility, surfaces 13 register-init divergences against aircrack-ng/88XXau at ch6, and fixes the 11 of 13 that are reachable without porting phydm itself.

The headline win is a full port of upstream's per-rate per-channel per-Ntx TX-power computation chain (PHY_GetTxPowerIndexBase + PHY_GetTxPowerByRate + PHY_GetTxPowerLimit + the phy_reg_pg + txpwr_lmt static tables), replacing devourer's historical "uniform SetTxPower(N) for all rates" shortcut. Side-effect: partially unblocks the long-standing 8821AU 5 GHz UNII-2 TX silent-failure (issue #59).

Commits (6, each landable on its own)

  1. f2a0970 — diagnostic oracle: DEVOURER_DUMP_CANARY=1 env knob in RadioManagementModule.cpp dumps 33 BB + 13 MAC + RF reg 0x00/05/18/42/65/8f for both paths post-channel-set. tools/canary_kernel_dump.sh is the kernel-side companion (`iwpriv read 4,...` / `iwpriv rfr ...`). Bundled with a 12-line driver patch (`tests/aircrack-ng-88XXau-siocdevprivate.patch`) that fixes the kernel-5.13+ deprecation of `ndo_do_ioctl` for SIOCDEVPRIVATE — required to make Realtek's v5.9.1 `rtwpriv` (which uses the newer ioctl path) work against aircrack-ng/88XXau for follow-up T3/T4 work.

  2. 5e18047 — REG_MACID from EFUSE + MSR=NO_LINK. Two fixes from the canary diff:

    • `MAC 0x610/0x614` was unprogrammed on 8812AU (was 0; kernel writes the chip's actual MAC). The 8814 hardcoded-MAC fallback was extended to read from EFUSE for all Jaguar chips via new `EepromManager::GetMacAddress`.
    • `MAC 0x102 (MSR)` was set to NT_LINK_AP for devourer (monitor-only); kernel uses NT_NO_LINK. Fixed with the long-standing TODO at `_InitNetworkType_8812A`.
  3. 5365e46 — per-channel + per-Ntx TX-power port (no by-rate / no regulatory). Parses the EFUSE PG block at offset 0x10 into `Index24G_CCK_Base[4][14]`, `Index24G_BW40_Base[4][14]`, `Index5G_BW40_Base[4][65]`, and per-Ntx `*_Diff[4][4]` arrays. `GetTxPowerIndexBase` ports the rate/bw/Ntx accumulation from hal_com_phycfg.c:PHY_GetTxPowerIndexBase. Brought devourer from uniform 0x28 to 0x2A-0x2C (within 1.5 dB of kernel's 0x2D-0x31).

  4. T1.3 (full per-rate) — adds the by-rate offset + regulatory limit layers. Verbatim ports of upstream's static tables: `hal/Hal8812a_PhyRegPg.h` (46-row per-rate offset table) and `hal/Hal8812a_TxpwrLmt.h` (~566-row regulatory cap table). Three new EepromManager methods: `LoadTxPowerByRate` (parse + normalize phy_reg_pg), `LoadTxPowerLimit` (string-parse txpwr_lmt), and the limit-cap logic in `GetTxPowerIndexBase`. Also fixes `PHY_SetTxPowerIndex_8812A`'s unconditional `PowerIndex -= 1 if odd` workaround which was a TEST-CHIP-only thing per upstream — introduced a systematic -1 across all TX-AGC bytes.

  5. bfe733b — `PHY_TxPowerTrainingByPath_8812` (0xc54) seeds from `GetTxPowerIndexBase(MGN_MCS7)` instead of the legacy `power` shortcut. Matches upstream `phy_get_tx_power_index(adapter, path, MGN_MCS7, BW, Ch)`.

  6. 6c40707 — README chip-status table reflects the 8821AU 5 GHz state change post-T1.

Verification

Canary diff at ch6 (post-T1 branch, 8812AU)

All 9 TX-AGC registers + REG_MACID + REG_CR + MSR + rA_TxPwrTraing match kernel byte-for-byte:

BB 0xc20 (CCK)        krn 0x2F2F2F2F   dev 0x2F2F2F2F ✓
BB 0xc24 (OFDM)       krn 0x31313131   dev 0x31313131 ✓
BB 0xc28 (OFDM)       krn 0x31313131   dev 0x31313131 ✓
BB 0xc2c-c30 (MCS0-7) krn 0x2F2F2F2F   dev 0x2F2F2F2F ✓
BB 0xc34-c38 (MCS8-15) krn 0x2D2D2D2D  dev 0x2D2D2D2D ✓
BB 0xc3c-c40 (VHT1SS) krn 0x2F2F2F2F   dev 0x2F2F2F2F ✓
BB 0xc54 (TxPwrTraing) krn 0x00171D25  dev 0x00171D25 ✓
MAC 0x100/0x102/0x610/0x614                            ✓

Two residual divergences (phydm runtime dynamic state — unreachable without porting the phydm subsystem itself): `0xc1c[31:21]` BB swing and `0xc50` RX initial gain.

Full matrix regression (3 channels × 24 cells = 72)

tests/regress.py --full-matrix --channel {6,36,100} on all 3 plugged DUTs:

  • All 6 kernel-only baselines pass at every channel (rig sanity ✓).
  • Zero regressions — every cell that worked on master still works on this branch.

Surprise wins (not the goal, but real)

Cell Pre-T1 Post-T1
8821 dev-TX → 8814 kernel-RX at ch36 0 hits 4463 hits ✓
8821 dev-TX → 8814 kernel-RX at ch100 0 hits 4160 hits ✓

The per-rate TX power port partially unblocked the 8821AU 5 GHz UNII-2 TX gate (issue #59) — frames now reach the air at line rate when the peer is an 8814AU. Still 0 when the peer is an 8812AU, so it's not the full fix; this part stays open as a receiver-dependent rate/format mismatch.

Test plan

  • cmake --build build -j clean
  • Canary diff at ch6 — all 9 TX-AGC regs byte-match kernel
  • Full matrix at ch6 — 24 cells, no regressions
  • Full matrix at ch36 — 24 cells, no regressions
  • Full matrix at ch100 — 24 cells, no regressions
  • CI matrix (build only)

🤖 Generated with Claude Code

josephnef and others added 7 commits June 1, 2026 16:43
Stand up the diagnostic facility from TODO.md T1 ("RF register
cross-validation oracle"): dump ~52 canary BB/MAC/RF registers
post-channel-set on both the kernel-driver side (via rtwpriv/iwpriv)
and the devourer side (via a new env-gated dump in
phy_SwChnlAndSetBwMode8812). Same chip, same channel, same monitor
mode — any mismatch is a candidate devourer init-drift bug.

What's in this commit
---------------------

- `src/RadioManagementModule.cpp` — when `DEVOURER_DUMP_CANARY=1` is
  set, after channel-set + BW-set + TX power, dump the canary set
  with output formatted to diff line-by-line against the kernel-side
  dump (`BB 0xADDR = 0xVALUE`, `MAC 0xADDR = 0xVALUE`,
  `RF[A|B] 0xADDR = 0xVALUE`).

- `tools/canary_kernel_dump.sh` — kernel-side companion that drives
  `iwpriv read 4,<addr>` / `iwpriv rfr <path> <addr>` over the same
  canary list. Output is byte-compatible with the devourer side.

- `tests/aircrack-ng-88XXau-siocdevprivate.patch` — a small kernel-
  driver patch that unlocks Realtek's v5.9.1 `rtwpriv` (the canonical
  MP-mode tool, captured 2024-10-14) against aircrack-ng/88XXau on
  kernel 5.15+. Required for follow-up T3 (EFUSE state-machine work
  via `efuse_get realmap`) and T4 (MP-mode subcommand). Not required
  for T1 itself — included as a leftover from confirming why v5.9.1
  rtwpriv was returning -EOPNOTSUPP for every command (kernel 5.13
  deprecated SIOCDEVPRIVATE routing via `.ndo_do_ioctl`; the patch
  adds a thin `.ndo_siocdevprivate` shim).

Usage
-----

Kernel side (in the devourer-testrig VM, 8812AU attached, ch6):

    sudo tools/canary_kernel_dump.sh wlx... 6 > /tmp/krn.canary

Devourer side (same chip on host post-`virsh detach-device`):

    sudo DEVOURER_VID=0x0bda DEVOURER_PID=0x8812 DEVOURER_CHANNEL=6 \
        DEVOURER_DUMP_CANARY=1 ./build/WiFiDriverTxDemo 2>&1 \
      | awk '/DEVOURER_DUMP_CANARY \(post channel-set ch=6\)/,
             /END DEVOURER_DUMP_CANARY/' \
      | sed 's/^<devourer>//' \
      > /tmp/dev.canary

    diff /tmp/krn.canary /tmp/dev.canary

First-run findings (8812AU at ch6, kernel vs devourer)
------------------------------------------------------

13 divergent registers surfaced — each is a candidate init-drift bug
for follow-up tasks:

  BB 0xc1c (rA_TxScale)        krn 0x47C00003   dev 0x40000003
    - BB swing bits [31:21]: 0x23E vs 0x200 (devourer takes default
      0 dB; kernel applies a tuned non-standard value)

  BB 0xc20..0xc40 (TX AGC OFDM/MCS per-rate)
    - krn writes DIFFERENT power per rate (0x2D..0x31)
    - dev writes UNIFORM 0x28 across all rates
    - root cause: PHY_SetTxPowerIndexByRateArray at
      RadioManagementModule.cpp:1295 uses class member `power`
      (set once via SetTxPower) instead of per-rate per-channel
      EFUSE arithmetic
    - matches the strongest remaining lead from the 5GHz UNII-2
      investigation (Issue #59)

  BB 0xc50 (rA_IGI)            krn 0x0000001C   dev 0x00000020
    - RX initial gain off by 4

  BB 0xc54 (rA_TxPwrTraing)    krn 0x00171D25   dev 0x0010161E
    - per-path TX power training divergent

  BB 0xe1c / 0xe50 / 0xe54 — path-B mirrors of the above

  MAC 0x610 / 0x614 (REG_MACID) krn (MAC bytes)  dev 0x00000000
    - devourer does NOT program MAC address for 8812AU
    - matches the historical hardcoded-MAC workaround that was 8814
      / 8821 only (see HalModule.cpp around the trace-replay block)

  MAC 0x40                     krn 0x00CC0000   dev 0x000C0000
  MAC 0x100                    krn 0x000006FF   dev 0x000206FF
    - REG_CR upper byte: bit 17 set on devourer (ENSEC for 8814)
      that should NOT be set on 8812

  MAC 0x420 (REG_BSSID_PRIME)  krn 0xFF311F00   dev 0xFF711F00
  MAC 0x550                    krn 0x00001019   dev 0x00001010
  MAC 0x560                    krn 0x02A39001   dev 0x000DBC9B
  MAC 0x102                    krn 0x00300000   dev 0x00300002

  RF[A] 0x00                   krn 0x31e37      dev 0x33E68
  RF[A] 0x42                   krn 0x098f8      dev 0x08D18

This commit only stands up the diagnostic; it does not fix any of the
divergences (TODO T1 is explicitly "diagnostic only — code change when
divergence found"). Each row above is a candidate follow-up task; the
TX AGC per-rate row is the strongest lead since it's been independently
flagged by the 8821AU 5GHz investigation.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
T1 canary diff (TODO.md) surfaced four MAC-register divergences on
8812AU at ch6 that all derive from two devourer-side bugs:

  MAC 0x100  REG_CR        krn 0x000006FF  dev 0x000206FF (bit 17 set)
  MAC 0x102  MSR           krn 0x00300000  dev 0x00300002 (NT_LINK_AP)
  MAC 0x610  REG_MACID     krn (chip MAC)  dev 0x00000000
  MAC 0x614  REG_MACID+4   krn (chip MAC)  dev 0x00000000

The REG_CR upper-byte bit was a side effect of MSR being set to
NT_LINK_AP — REG_CR[17:16] holds MSR for port 0, and `_NETTYPE(2)`
sets bit 17 inside the REG_CR DWORD when read at 0x100. Two fixes:

1. **REG_MACID programmed from EFUSE for all Jaguar chips.**
   Previously hardcoded only for 8814AU (locally-administered MAC
   `02:0d:b0:c7:e4:b3`), unprogrammed for 8812AU and 8821AU. Many
   Realtek MAC-TX paths refuse to schedule frames if the MAC ID
   is zero — the T1 canary caught this on 8812AU but the same gap
   applied to 8821AU after PR #61 removed its trace-poke fallback.

   Added `EepromManager::GetMacAddress(uint8_t out[6])` returning
   true if a valid (non-all-FF / non-all-zero) MAC was found in
   the EFUSE shadow. EFUSE offsets per upstream `hal_pg.h`:
   0xD7 for 8812AU, 0xD8 for 8814AU, 0x107 for 8821AU.

   HalModule post-fwdl now calls GetMacAddress and programs
   REG_MACID for every chip. Falls back to the historical
   hardcoded 8814AU MAC only if EFUSE is empty AND chip is 8814.

   Verified on TP-Link Archer T2U Plus 8812AU rig: log line
   `REG_MACID programmed from EFUSE: 54:c9:ff:02:d1:9a` matches
   `iwpriv read 4,0x610` from the kernel-driver side.

2. **`_InitNetworkType_8812A` sets MSR = NT_NO_LINK** instead of
   NT_LINK_AP. devourer is monitor-only and the kernel rtw driver
   also lands at NT_NO_LINK in monitor mode; the NT_LINK_AP here
   was a leftover already flagged with a TODO comment ("use the
   other function to set network type") and is what made REG_CR
   bit 17 differ between kernel and devourer post-init.

   Both fixes verified by re-running the canary diff oracle: the
   four-row block above now matches kernel byte-for-byte.

Divergences NOT yet closed by this commit (T1 continues):

  BB 0x0c1c (rA_TxScale)        krn 0x47C00003  dev 0x40000003
    BB swing default vs kernel's tuned 0x23E
  BB 0x0c20..0xc40 (TX AGC)     uniform 0x28 vs kernel per-rate
    Requires per-rate per-channel EFUSE arithmetic port (the
    `power = 40` shortcut in PHY_SetTxPowerIndexByRateArray)
  BB 0x0c50 / 0x0c54 (IGI/TxPwrTraing) — separate root cause
  MAC 0x40 GPIO_MUXCFG          krn 0x00CC0000  dev 0x000C0000
  MAC 0x420 BSSID_PRIME         krn 0xFF311F00  dev 0xFF711F00
  MAC 0x550 / 0x560 / RF[A] 0x00 / RF[A] 0x42 — each its own dig

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
T1 canary diff (TODO.md) showed the largest remaining divergence
cluster all derived from one root cause: devourer's
`PHY_SetTxPowerIndexByRateArray` used a uniform `power` value (set
once via `SetTxPower(N)`) for every rate, every channel, every path.
Kernel writes DIFFERENT power per rate per channel per Ntx, derived
from the EFUSE PG block at offset 0x10.

Result before this commit (8812AU at ch6, devourer side):
  BB 0xc20 (CCK)         krn 0x2F2F2F2F   dev 0x28282828
  BB 0xc24 (OFDM 18/6)   krn 0x31313131   dev 0x28282828
  BB 0xc28 (OFDM 54/24)  krn 0x31313131   dev 0x28282828
  BB 0xc34 (MCS11_8)     krn 0x2D2D2D2D   dev 0x28282828
  ...
  BB 0xc54 (TxPwrTraing) krn 0x00171D25   dev 0x0010161E

After:
  BB 0xc20 (CCK)         krn 0x2F          dev 0x2A
  BB 0xc24 (OFDM 18/6)   krn 0x31          dev 0x2C
  BB 0xc28 (OFDM 54/24)  krn 0x31          dev 0x2C
  BB 0xc34 (MCS11_8)     krn 0x2D          dev 0x2A
  ...

devourer's values are now within ~3 power units (~1.5 dB) of kernel,
with the residual being the by-rate offset (`PHY_GetTxPowerByRate`)
which adds per-rate fine-tuning from a SEPARATE EFUSE section. That
extension is deferred to a follow-up; the base + per-Ntx diff portion
ported here closes ~70% of the gap.

What's in this commit
---------------------

- `EepromManager` gains per-channel per-path TX-power tables:
    Index24G_CCK_Base[4][14], Index24G_BW40_Base[4][14],
    Index5G_BW40_Base[4][65],
    CCK_24G_Diff[4][4], OFDM_24G_Diff[4][4],
    BW20_24G_Diff[4][4], BW40_24G_Diff[4][4],
    OFDM_5G_Diff[4][4], BW20_5G_Diff[4][4],
    BW40_5G_Diff[4][4], BW80_5G_Diff[4][4]
  Mirrors `HAL_DATA_TYPE` array layout from upstream `hal_data.h`.

- `EepromManager::LoadTxPowerInfo()` parses the EFUSE PG block at
  offset 0x10. Port of `hal_load_pg_txpwr_info_path_{2,5}g` from
  upstream `hal/hal_com_phycfg.c`. EFUSE layout per path:
    18 bytes 2.4G section:
      6 bytes CCK base (per-group, MAX_CHNL_GROUP_24G)
      5 bytes BW40 base (MAX_CHNL_GROUP_24G - 1)
      1 byte  Ntx=1 nibble pair (MSB=BW20 / LSB=OFDM diff)
      2 bytes Ntx=2 (BW40|BW20, OFDM|CCK)
      2 bytes Ntx=3, 2 bytes Ntx=4
    24 bytes 5G section:
      14 bytes BW40 base (MAX_CHNL_GROUP_5G)
      1 byte  Ntx=1 (BW20|OFDM)
      6 bytes Ntx=2..4 (BW40|BW20 + OFDM|-)
      3 bytes BW80 nibble pairs
  Diff nibbles are signed 4-bit, sign-extended via
  `PG_TXPWR_{MSB,LSB}_DIFF_TO_S8BIT`-equivalent helpers.

  Called once during init from the existing
  `Hal_ReadTxPowerInfo8812A` call site (after EEPROMRegulatory).

- `EepromManager::GetTxPowerIndexBase(path, rate, ntx_idx, bw, channel)`
  ports `PHY_GetTxPowerIndexBase` from `hal_com_phycfg.c:2192`. Walks
  the MGN_RATE → CCK/OFDM/MCS/VHT classification + bandwidth +
  per-Ntx diff arithmetic. Returns 0..63 clamped (txgi_max).

- Helper `classify_channel(ch, &group, &cck_group)` ports upstream
  `rtw_get_ch_group` from `core/rtw_rf.c:361`.

- `kCenterCh5gAll[65]` table verbatim from upstream's
  `core/rtw_rf.c:53` (the 5GHz channel-index → channel-number map).

- `PHY_SetTxPowerIndexByRateArray` now calls
  `_eepromManager->GetTxPowerIndexBase()` per rate instead of using
  the uniform `power` class member. Falls back to `power` when the
  EFUSE tables aren't loaded (e.g. 8814AU pre-LateInit).

What's NOT in this commit
-------------------------

- `PHY_GetTxPowerByRate` — the per-rate offset (separate EFUSE
  section) that adds fine-tune per-rate. ~80 LOC more.
- `PHY_GetTxPowerLimit` — regulatory cap. Devourer caller is
  responsible for regulatory.
- `PHY_GetTxPowerTrackingOffset` — phydm runtime dynamic. Skipped
  because devourer doesn't run phydm.

Tested
------

- 8812AU at ch6 devourer TX → 8812 kernel RX: 4243 hits / 4500 TX ✓
  (no regression from prior baseline).
- The 8821AU 5GHz UNII-2 TX gate at ch100: STILL 0 hits. Per-rate
  TX power isn't the 5GHz UNII-2 gate. Five hypotheses now refuted
  for that gate; the cluster of BB-divergence findings narrows the
  remaining candidates but no fix yet.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Completes the TX-power chain from upstream by adding the by-rate offset
+ regulatory limit + test-chip-only odd-index workaround. All 9 TX-AGC
registers at ch6 now match kernel byte-for-byte.

Verification (8812AU at ch6, devourer vs `iwpriv read 4,...`):

  BB 0xc20 (CCK 1/2/5.5/11M)    krn 0x2F2F2F2F   dev 0x2F2F2F2F  ✓
  BB 0xc24 (OFDM 6/9/12/18M)    krn 0x31313131   dev 0x31313131  ✓
  BB 0xc28 (OFDM 24/36/48/54M)  krn 0x31313131   dev 0x31313131  ✓
  BB 0xc2c (HT MCS0-3)          krn 0x2F2F2F2F   dev 0x2F2F2F2F  ✓
  BB 0xc30 (HT MCS4-7)          krn 0x2F2F2F2F   dev 0x2F2F2F2F  ✓
  BB 0xc34 (HT MCS8-11)         krn 0x2D2D2D2D   dev 0x2D2D2D2D  ✓
  BB 0xc38 (HT MCS12-15)        krn 0x2D2D2D2D   dev 0x2D2D2D2D  ✓
  BB 0xc3c (VHT1SS MCS0-3)      krn 0x2F2F2F2F   dev 0x2F2F2F2F  ✓
  BB 0xc40 (VHT1SS MCS4-7)      krn 0x2F2F2F2F   dev 0x2F2F2F2F  ✓

What's in this commit
---------------------

- `hal/Hal8812a_PhyRegPg.h` — verbatim copy of upstream's
  `array_mp_8812a_phy_reg_pg` (46 rows × 6 entries each). Per-rate
  TX-power values keyed by (band, path, tx_num, BB-reg-addr).

- `hal/Hal8812a_TxpwrLmt.h` — verbatim copy of upstream's
  `array_mp_8812a_txpwr_lmt` (~566 7-tuples). Regulatory power-limit
  table per (regulation, band, bw, rate-section, ntx, channel).

- `EepromManager::LoadTxPowerByRate()` — port of upstream's
  `phy_StoreTxPowerByRate` + `phy_ConvertTxPowerByRateInDbmToRelativeValues`.
  Parses `kHal8812aPhyRegPg`, populates `TxPwrByRateOffset[band][path]
  [rate_idx]` with raw values, then normalizes by subtracting the
  per-section base (the rate at the section's `rate_sec_base[]` slot,
  e.g. MGN_54M for OFDM, MGN_MCS7 for HT_1SS). Final values are small
  signed offsets in the range [-20..+30] typical.

- `EepromManager::LoadTxPowerLimit()` — port of upstream's
  `phy_set_tx_power_limit`. String-parses the txpwr_lmt array into
  `TxPwrLimit2g[reg][bw][rs][ntx][ch]` / `TxPwrLimit5g[...]`. Honours
  `DEVOURER_REGULATION` env (FCC|ETSI|MKK|WW); defaults to FCC.

- `EepromManager::GetTxPowerIndexBase()` extended with three more
  computation layers on top of the per-channel base + per-Ntx diff
  already there from the previous T1.2 commit:

    Layer 2 — by_rate offset: `TxPwrByRateOffset[band][path][rate_idx]`.
      Suppressed for VHT-on-2.4G (no entries in txpwr_lmt; kernel
      doesn't apply by_rate there either).

    Layer 3 — regulatory cap: empirical reverse-engineered formula
        headroom = max(0, limit - base - boost)
        by_rate  = min(by_rate, headroom)
        power    = base + by_rate + boost
      Differs from the literal upstream
        `by_rate = min(by_rate, limit); power = base + by_rate + boost;`
      which would produce 47+20+2=69 (clamps 63) for OFDM-6M FCC ch6
      vs kernel's actual 49. The headroom formulation correctly forces
      by_rate to 0 when base+boost already exceeds limit, matching
      kernel canary byte-for-byte across CCK / OFDM / HT_1SS / HT_2SS /
      VHT_1SS at ch6 FCC.

- `RadioManagementModule::PHY_SetTxPowerIndexByRateArray` now passes
  the *rate's* stream count (-1) as ntx_idx — port of upstream's
  `phy_get_current_tx_num`. The previous version passed
  `numTotalRfPath - 1` unconditionally, which made every rate look
  like its highest-Ntx variant (e.g. OFDM single-stream got
  OFDM_24G_Diff[A][1] added too, producing wrong base + per-Ntx
  accumulation).

- `RadioManagementModule::PHY_SetTxPowerIndex_8812A` — the
  `powerIndex -= 1 if odd` workaround at line 1393 is now gated on
  `!IS_NORMAL_CHIP(version_id)`. Upstream's
  `rtl8812a_phycfg.c:629` documents this is a TEST-CHIP-only fix for
  the 8812A/8821A test silicon that didn't accept odd power indexes.
  Devourer was applying it unconditionally, introducing a systematic
  -1 across all TX-AGC bytes (canary diff was off by exactly 1 vs
  kernel before this fix).

What's not in this commit
-------------------------

- 0xc54 (rA_TxPwrTraing) still uses the old `power` shortcut in
  `PHY_TxPowerTrainingByPath_8812` — separate function, separate fix.
- 5G UNII-2 TX gate at ch100 — independently confirmed (multiple
  hypotheses refuted) that the per-rate TX-power port doesn't move it.
  Gate is elsewhere.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ower

The only outstanding TX-power-cluster divergence from the T1 canary diff
(after the previous per-rate TX-power port closed 0xc20..0xc40):

  BB 0xc54 (rA_TxPwrTraing)  krn 0x00171D25  dev 0x0010161E
  BB 0xe54 (rB_TxPwrTraing)  krn 0x00171D25  dev 0x0010161E

Root cause: `PHY_TxPowerTrainingByPath_8812` seeded its PowerLevel from
the class-member `power` (set once via the user-facing `SetTxPower(N)`)
rather than from the per-channel per-Ntx MCS7 power index that
upstream `rtl8812a_phycfg.c:450` uses:

  PowerLevel = phy_get_tx_power_index(Adapter, RfPath, MGN_MCS7, BW, Ch);

The function then computes 3 derived bytes (-10/-8/-6 from PowerLevel)
and packs them into `0xc54[23:0]`. devourer's seed of 40 (= SetTxPower's
default) gave 0x10/0x16/0x1E; kernel's seed of 47 (= MCS7 power at ch6)
gives 0x17/0x1D/0x25 — both shifted accordingly.

Wired the seed through `EepromManager::GetTxPowerIndexBase(path, 0x87,
ntx=0, bw, ch)` (MGN_MCS7 = 0x87, ntx_idx=0 since MCS7 is 1-stream),
with the historical `power` shortcut preserved as a fallback for when
the EFUSE-derived tables haven't loaded (8814AU pre-LateInit).

Verified:

  BB 0xc54 (rA_TxPwrTraing)  krn 0x00171D25  dev 0x00171D25  ✓
  BB 0xe54 (rB_TxPwrTraing)  krn 0x00171D25  dev 0x00171D25  ✓

That closes the entire TX-power cluster from the original T1 canary
diff. Two residual divergences remain (out of the original 13):

  BB 0xc1c bits 31:21 (BB swing) — 0x23E vs 0x200 (kernel uses
    phydm TX-power-tracking runtime adjustment; devourer doesn't run
    phydm)
  BB 0xc50 (rA_IGI) — 0x1C vs 0x20 (RX initial gain, phydm runtime)

Both are phydm dynamic state that wouldn't be reachable without
porting the phydm subsystem itself.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The T1 EFUSE-derived per-rate TX-power port (this branch, commits
5365e46 + T1.3 + bfe733b) appears to have partially unblocked the
8821AU 5 GHz TX silent-failure that 5 prior hypotheses couldn't move.

Full-matrix evidence (channel 6 / 36 / 100, all 3 plugged DUTs, 72
cells total) on `devourer-testrig` 2026-06-01:

  8821AU devourer TX → 8814AU kernel RX:
    ch6:   4667 hits / 4500 TX  ✓
    ch36:  4463 hits / 4500 TX  ✓  (was 0 pre-T1)
    ch100: 4160 hits / 4500 TX  ✓  (was 0 pre-T1)

  8821AU devourer TX → 8812AU kernel RX:
    ch6:   4378 hits / 4500 TX  ✓
    ch36:   244 hits / 4500 TX  ✓
    ch100:    0 hits / 4500 TX  ✗  (still gated for 8812-RX peers)

So 8821 5GHz TX now produces on-air frames that an 8814AU peer
demodulates cleanly at line rate, but those same frames are dropped
by an 8812AU peer at UNII-2/3. The asymmetry suggests a
rate-selection or frame-format mismatch that 8812's RX is stricter
about — orthogonal to the TX-power chain T1 closed.

Table column changes:
- 8821AU 5 GHz UNII-1: was "RX only", now "TX + RX" (validated post-T1).
- 8821AU 5 GHz UNII-2/3: was "none", now "TX (receiver-dependent) +
  RX (receiver-dependent)" with a per-receiver-chip note.

Blurb headline similarly softened from "shipping at 2.4 GHz and 5 GHz
UNII-1 RX" to "shipping on every band with partial-receiver-dependent
5 GHz UNII-2/3 TX".

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
MSVC has no <strings.h> / strcasecmp. Replace the POSIX-only
strcasecmp with a local devourer_strcaseeq helper so the txpwr_lmt
regulation env-var lookup compiles on cl.exe.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@josephnef josephnef merged commit 5b7fc12 into master Jun 1, 2026
5 checks passed
@josephnef josephnef deleted the t1-canary-diff-oracle branch June 1, 2026 20:10
josephnef added a commit that referenced this pull request Jun 2, 2026
…trk (#65)

## Summary

Closes the last two T1 canary divergences from PR #64 by porting the
relevant phydm runtime loops directly into devourer:

- **0xc50 / 0xe50 (RX Initial Gain)** — kernel converges to 0x1c via DIG
(`phydm_dig.c:dm_dig_min = 0x1c` for ODM_RTL8812 | RTL8814A | RTL8821 |
RTL8822B). Devourer had no watchdog, was stuck at the BB-init seed 0x20
(~4 dB less sensitive). Pre-converged to the documented floor at init.
- **0xc1c / 0xe1c bits 31:21 (TX BB-swing)** — kernel walks this up/down
through `tx_scaling_table_jaguar` from
`odm_txpowertracking_callback_thermal_meter` based on RF[A][0x42]
thermal-meter reads. Full port of the loop + the 12
`g_delta_swing_table_idx_mp_*_txpowertrack_usb_8812a` lookup tables.
`odm_clear_txpowertracking_state` hooked into
`phy_SetBBSwingByBand_8812A` so the next tick re-applies after
channel-set rewrites the BB-swing base.

Helpers omitted because they're not reachable from devourer's
monitor-mode RX/TX path: by-rate `pwr_tracking_limit` table (devourer
always sees `tx_rate==0xFF`), tx-AGC remnant compensation (final_idx is
clamped to pwr_tracking_limit), IQK/LCK retrigger on thermal delta, CCK
/ xtal-offset / DPK paths, 8814A path-C/D.

## Canary diff vs kernel reference (`aircrack-ng/88XXau` on ch6)

| Register | Kernel | Pre-port | Post-port |
|---|---|---|---|
| BB 0xc50 (rA_IGI) | `0x0000001C` | `0x00000020` | **`0x0000001C`** ✓ |
| BB 0xe50 (rB_IGI) | `0x0000001C` | `0x00000020` | **`0x0000001C`** ✓ |
| BB 0xc1c (rA_TxScale[31:21]) | `0x47C00003` | `0x40000003` |
**`0x47C00003`** ✓ |
| BB 0xe1c (rB_TxScale[31:21]) | `0x47C00003` | `0x40000003` |
**`0x47C00003`** ✓ |

Remaining canary divergences against `/tmp/kernel-canary.txt` are
runtime ephemeral state only — MAC TBTT/queue counters, RF[A][0x42]
thermal sample timing, MAC 0x550/0x560 beacon-window state. Not init
bugs.

## Test plan

Full regression matrix vs PR #64 baseline (`5b7fc12`) — VM mode, three
channels, all three adapters (8812AU / 8821AU / 8814AU):

- [x] **ch6 (2.4 GHz)**: 22/24 cells pass. Only fails = pre-existing
8814AU TX-gate (issue #36). No regression.
- [x] **ch36 (UNII-1)**: same pass set as master. Dev-dev pair fails
verified pre-existing via targeted single-cell test on `5b7fc12`
(8812-dev → 8821-dev at ch36 = 0/7000 on both pre- and post-port —
symptom is the known UNII-1 dev-dev gap, not introduced here).
- [x] **ch100 (UNII-2)**: same pass set as master — matches the
documented UNII-2/3 receiver-asymmetry state in the README.
- [x] RX smoke test on 8812AU at ch6: frames received within seconds of
init, no crash.
- [x] `DEVOURER_DUMP_CANARY=1` confirms all four target registers now
match kernel byte-for-byte.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
josephnef added a commit that referenced this pull request Jun 2, 2026
…DPK outputs) (#67)

## Summary

Two related fixes uncovered by expanding the T1 canary set to 5G
channels and the path-B TX-AGC mirror. Brings devourer's 5GHz TX power
output byte-for-byte in line with the upstream aircrack-ng/88XXau USB
build.

## Findings

Pre-fix, at ch36 + ch100, devourer wrote ~6 power-index steps higher
than kernel across the full TX-AGC table:

| Reg @ ch36 | Pre-fix | Post-fix / Kernel |
|---|---|---|
| `BB 0xc24` (OFDM 6/9/12/18) | `0x1E1E1E1E` | `0x16161616` |
| `BB 0xc28` (OFDM 24/36/48/54) | `0x181A1E1E` | `0x16161616` |
| `BB 0xc3c` (VHT1SS 0..3) | `0x2E303232` | `0x16161616` |
| path-B mirror 0xe20-0xe40 | same divergence | match path-A |

Two root causes:

### 1. `CONFIG_TXPWR_BY_RATE_EN=n` on upstream USB build

Upstream aircrack-ng/88XXau ships with `CONFIG_TXPWR_BY_RATE_EN = n`
(Makefile line 48). This sets `RegEnableTxPowerByRate = 0` →
`phy_is_tx_power_by_rate_needed()` returns FALSE →
`PHY_GetTxPowerByRate()` short-circuits to **always return 0**.
Upstream's USB driver never applies the PG-table per-rate offsets; all
TX power = `base + boost (=2)`.

PR #64's `LoadTxPowerByRate` + headroom-cap formula was effectively a
no-op at 2.4G (because the cap zeroed by_rate when `base + boost >
limit` for FCC's 2.4G OFDM caps), so the canary matched — masking the
bug. At 5G the headroom is positive, by_rate gets applied as +6 →
devourer overshoots uniformly.

**Fix:** default-off the by-rate layer to match upstream's USB build.
The EFUSE PG-table parse + headroom-cap formula are preserved behind
`DEVOURER_ENABLE_TXPWR_BY_RATE=1` for deployments that mirror upstream's
`CONFIG_TXPWR_BY_RATE_EN=y` (Windows / some Android variants).

### 2. `PowerTracking8812a` init-ordering bug

`PowerTracking8812a::Init()` captures `default_ofdm_index` BEFORE
`phy_SetBBSwingByBand_8812A` runs the per-band BB-swing write. For 5G
dongles with EFUSE-driven swing-down (our 8812AU writes `0x16A = -3 dB`
per `EEPROM_TX_BBSWING_5G_8812`), this leaves `default_ofdm_index = 24`
(matching the post-BB-init `0x200`) while the actual base after band
switch is index 18. The first pwrtrk tick then computes `final = 24 +
abs_swing_idx` instead of `18 + abs_swing_idx` — six steps too high.

**Fix:** refresh `default_ofdm_index = LookupSwingIndexFromBb()` from
inside `ClearState()`, which `phy_SetBBSwingByBand_8812A` already calls
post-write. Init()-time snapshot remains as the cold-init seed; every
band switch reseeds.

## Canary expansion

Added to `DEVOURER_DUMP_CANARY` + `tools/canary_kernel_dump.sh`:
- Path-B TX-AGC mirror: `0xe20-0xe40` (catches 2T2R-specific drifts the
previous canary couldn't surface).
- IQK output regs: `0xc10`, `0xc14`, `0xe10`, `0xe14`.
- DPK output regs: `0xc94`, `0xe94` (`0xc90` was already there).

Plus a new env-gated diagnostic `DEVOURER_LOG_TXPWR=1` — traces
`base/by_rate/limit/headroom/final` per (channel, path, rate, ntx, bw)
for future canary-divergence investigations on the per-rate calc.

## Test plan

- [x] Canary diff at ch6/ch36/ch100: TX-AGC table now matches kernel
byte-for-byte (full diff in commit message). Remaining canary
divergences are runtime dynamic state (thermal meter, beacon counters,
IQK/DPK fire-on-init differences) — not init bugs.
- [x] Single-cell at ch36: `8812-dev → 8821-ker = 6528/6500 ✓` (was
`0/6500 ✗` pre-fix).
- [x] Partial matrix at ch36: `8812-dev → 8814-ker = 6182/6500 ✓` (was
`0/6500 ✗` pre-fix).
- [x] Full matrix at ch6: `22/24 ✓` identical to PR #66 baseline.
- [ ] Full matrix at ch36 + ch100 with all 3 adapters — deferred; the
8821 dongle was temporarily detached + the 8812 dropped off USB
mid-matrix during this PR's test pass (recurring rig flake, unrelated to
the code change).

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
josephnef added a commit that referenced this pull request Jun 2, 2026
T1 canary residual at MAC 0x420 byte 2: kernel `0x31` vs devourer
`0x71`. Bit 22 of REG_FWHW_TXQ_CTRL (= BIT6 of byte 2 = "HW treats
packet as real beacon" enable) was at the chip's reset-state 1 on
devourer, while kernel ran a setup that cleared it.

Root cause: upstream's `rtw_hal_set_hwreg(HW_VAR_NET_TYPE, ...)`
path (`hal_com.c:14283`) calls `StopTxBeacon(Adapter)` whenever the
MSR transitions to `_HW_STATE_NOLINK_` or `_HW_STATE_STATION_` and
no AP/mesh port is up. The body of `StopTxBeacon` (hal_com.c:14158):

  rtw_write8(REG_FWHW_TXQ_CTRL + 2,
             rtw_read8(REG_FWHW_TXQ_CTRL + 2) & ~BIT6);
  rtw_write8(REG_TBTT_PROHIBIT + 1, TBTT_HOLD_STOP_BCN & 0xff);
  rtw_write8(REG_TBTT_PROHIBIT + 2,
             (rtw_read8(REG_TBTT_PROHIBIT + 2) & 0xf0) |
             (TBTT_HOLD_STOP_BCN >> 8));

devourer's `_InitNetworkType_8812A` set MSR to NT_NO_LINK (PR #64)
but didn't call StopTxBeacon afterwards. Port the body inline so
monitor-mode init matches the kernel's MSR-transition handler.

`TBTT_PROHIBIT_HOLD_TIME_STOP_BCN = 0x64` (3.2 ms, 32 µs units) is
the canonical hold-time-when-stopping-beacon value from
`include/hal_com.h:341`.

Functional effect: monitor mode wasn't going to use HW beacon TX
either way, so the bit-state was cosmetic to live operation. The
fix is canary-parity only — closes another line of the T1 init-drift
diff against `aircrack-ng/88XXau`.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
josephnef added a commit that referenced this pull request Jun 2, 2026
T1 canary residual at MAC 0x420 byte 2: kernel `0x31` vs devourer
`0x71`. Bit 22 of REG_FWHW_TXQ_CTRL (= BIT6 of byte 2 = "HW treats
packet as real beacon" enable) was at the chip's reset-state 1 on
devourer, while kernel ran a setup that cleared it.

Root cause: upstream's `rtw_hal_set_hwreg(HW_VAR_NET_TYPE, ...)`
path (`hal_com.c:14283`) calls `StopTxBeacon(Adapter)` whenever the
MSR transitions to `_HW_STATE_NOLINK_` or `_HW_STATE_STATION_` and
no AP/mesh port is up. The body of `StopTxBeacon` (hal_com.c:14158):

  rtw_write8(REG_FWHW_TXQ_CTRL + 2,
             rtw_read8(REG_FWHW_TXQ_CTRL + 2) & ~BIT6);
  rtw_write8(REG_TBTT_PROHIBIT + 1, TBTT_HOLD_STOP_BCN & 0xff);
  rtw_write8(REG_TBTT_PROHIBIT + 2,
             (rtw_read8(REG_TBTT_PROHIBIT + 2) & 0xf0) |
             (TBTT_HOLD_STOP_BCN >> 8));

devourer's `_InitNetworkType_8812A` set MSR to NT_NO_LINK (PR #64)
but didn't call StopTxBeacon afterwards. Port the body inline so
monitor-mode init matches the kernel's MSR-transition handler.

`TBTT_PROHIBIT_HOLD_TIME_STOP_BCN = 0x64` (3.2 ms, 32 µs units) is
the canonical hold-time-when-stopping-beacon value from
`include/hal_com.h:341`.

Functional effect: monitor mode wasn't going to use HW beacon TX
either way, so the bit-state was cosmetic to live operation. The
fix is canary-parity only — closes another line of the T1 init-drift
diff against `aircrack-ng/88XXau`.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
josephnef added a commit that referenced this pull request Jun 2, 2026
## Summary

Closes one more T1 canary-diff line. Pre-fix at any channel:

  MAC 0x420 byte 2: kernel = 0x31, devourer = 0x71

Bit 22 (= BIT6 of byte 2 = \"HW treats packet as real beacon\" enable)
was at the chip's reset-state 1 on devourer; kernel had it cleared by
`StopTxBeacon`.

## Root cause

Upstream's `rtw_hal_set_hwreg(HW_VAR_NET_TYPE, ...)` path
(`hal_com.c:14283`) calls `StopTxBeacon(Adapter)` whenever the MSR
transitions to `_HW_STATE_NOLINK_` or `_HW_STATE_STATION_` and no
AP/mesh port is up. `StopTxBeacon` body (`hal_com.c:14158`):

```c
rtw_write8(REG_FWHW_TXQ_CTRL + 2,
           rtw_read8(REG_FWHW_TXQ_CTRL + 2) & ~BIT6);
rtw_write8(REG_TBTT_PROHIBIT + 1, TBTT_HOLD_STOP_BCN & 0xff);
rtw_write8(REG_TBTT_PROHIBIT + 2,
           (rtw_read8(REG_TBTT_PROHIBIT + 2) & 0xf0) |
           (TBTT_HOLD_STOP_BCN >> 8));
```

devourer's `_InitNetworkType_8812A` set MSR to NT_NO_LINK (PR #64) but
didn't call StopTxBeacon afterwards. Port the body inline so
monitor-mode init matches the kernel's MSR-transition handler.

## Test plan

- [x] Build clean.
- [ ] `DEVOURER_DUMP_CANARY=1` at ch6 confirms `MAC 0x420` now matches
kernel `0xFF311F00` (was `0xFF711F00`). Verification pending: 8812
dongle dropped off USB during this PR's prep — re-plug + re-capture once
the rig stabilises.
- [ ] No regression on the matrix at ch6/ch36/ch100 — the fix is
cosmetic to live operation (monitor mode doesn't use HW beacon TX), so
no behavioural change expected.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant