monitor-less 802.11 interfaces now fail with an SIOCGIFINDEX error message #1508

Closed
infrastation opened this issue Apr 22, 2025 · 15 comments · Fixed by #1510

@infrastation
Member

This is a regression after pull request #1362. @johnthacker, please review if you can. libpcap since commit 0ac887b returns a cryptic error if the 802.11 interface does not support monitor mode. Steps to reproduce:

  1. Take a Raspberry Pi with a built-in 802.11 interface and a recent RaspiOS flavour of Debian.
  2. Confirm that it does not support monitor mode (applies to at least models 3B and 5):
    # iw phy phy0 interface add mon0 type monitor
    command failed: Operation not supported (-95)
    
  3. Note how the packaged tcpdump/libpcap fail to enable monitor mode:
    # tcpdump --version
    tcpdump version 4.99.3
    libpcap version 1.10.3 (with TPACKET_V3)
    OpenSSL 3.0.15 3 Sep 2024
    # tcpdump -i wlan0 -I
    tcpdump: wlan0: That device doesn't support monitor mode
    
  4. Install libnl-genl-3-dev and build tcpdump/libpcap from the current master branches.
  5. Note how the snapshot fails:
    # tcpdump --version
    tcpdump version 5.0.0-PRE-GIT
    libpcap version 1.11.0-PRE-GIT (with TPACKET_V3)
    64-bit build, 64-bit time_t
    # tcpdump -i wlan0 -I
    tcpdump: SIOCGIFINDEX: No such device
    # opentest -i wlan0 -I
    opentest: wlan0: Generic error
    (SIOCGIFINDEX: No such device)
    
  6. Optionally connect a USB 802.11 adapter that supports monitor mode and see that the happy path seems to work as expected.

The immediate reason for the different message is as follows. Given two interfaces (one without monitor mode support and one with), the code paths begin in pcap_activate_linux() and remain identical until the "Now configure the monitor interface up." block in enter_rfmon_mode(). There the ioctl(sock_fd, SIOCGIFFLAGS, &ifr) call fails if mon0 does not exist, and pcapint_fmt_errmsg_for_errno() sets the error message. Then del_mon_if() invokes iface_get_id(), which attempts ioctl(fd, SIOCGIFINDEX, &ifr); that fails too and overwrites the error message with what the user is seeing.
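
To illustrate the masking described above, here is a minimal sketch, not libpcap code. The two helper functions and the first message string are hypothetical stand-ins for the SIOCGIFFLAGS failure and the del_mon_if() clean-up. It shows how a clean-up step that reports into the caller's error buffer replaces the first error, and how a scratch buffer would preserve it.

```c
#include <stdio.h>
#include <string.h>

#define ERRBUF_SIZE 256

/* Hypothetical stand-in for the step that fails first
 * (SIOCGIFFLAGS on a mon0 that does not exist). The text is made up. */
static int bring_up_monitor(char *errbuf)
{
    snprintf(errbuf, ERRBUF_SIZE, "mon0: SIOCGIFFLAGS: No such device");
    return -1;
}

/* Hypothetical stand-in for the clean-up, which also fails and
 * reports into whatever buffer it is given. */
static int delete_monitor(char *errbuf)
{
    snprintf(errbuf, ERRBUF_SIZE, "SIOCGIFINDEX: No such device");
    return -1;
}

/* Clean-up shares the caller's buffer: the second failure masks the first. */
static int enter_monitor_shared(char *errbuf)
{
    if (bring_up_monitor(errbuf) == -1) {
        (void)delete_monitor(errbuf);
        return -1;
    }
    return 0;
}

/* Clean-up writes into a scratch buffer: the first failure survives. */
static int enter_monitor_scratch(char *errbuf)
{
    char scratch[ERRBUF_SIZE];

    if (bring_up_monitor(errbuf) == -1) {
        (void)delete_monitor(scratch);
        return -1;
    }
    return 0;
}
```

With the shared buffer the caller ends up with "SIOCGIFINDEX: No such device"; with the scratch buffer it keeps the SIOCGIFFLAGS message. This only concerns which message is shown, not why the monitor-less path got this far.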

The root cause seems to be elsewhere, because the monitor-less code path should not have reached that block and the clean-up of a non-existent monitor interface should not be necessary in the first place. It seems wrong that the add_mon_if() call just before reports success in both cases. In both cases ret == 1 and type == NL80211_IFTYPE_STATION after the call to get_if_type(). Commenting out the call to get_if_type() and instead setting the two variables directly (which is almost the same as reverting the commit in question) causes add_mon_if() to fail with the correct error message. So it seems that get_if_type() does not communicate with the netlink socket correctly and leaves it in a state that keeps add_mon_if() from failing as it should.

@johnthacker
Contributor

Ok, I will try to take a look.

@infrastation
Member Author

Thank you. Additional information: for non-802.11 interfaces an attempt to enable wireless monitor mode still returns the correct error message. RPI4B can be used to reproduce the problem too.

@guyharris
Member

guyharris commented Apr 23, 2025

get_if_type() first does

	ifindex = iface_get_id(sock_fd, device, handle->errbuf);
	if (ifindex == -1)
		return PCAP_ERROR;

to try to get the interface index of the specified device. Does that succeed or does it fail?

@johnthacker
Contributor

Unfortunately I don't have any devices that have the unexpected behavior. I can push a WIP branch that can add some debugging, if you can take a look.

johnthacker added a commit to johnthacker/libpcap that referenced this issue Apr 23, 2025
@johnthacker
Contributor

If you build libpcap with that commit and link with tcpdump, this is the result I get with an interface that does support monitor mode (with some of the payload bytes omitted):

./tcpdump -i wlp1s0f0u2i3 -I > /dev/null
--------------------------   BEGIN NETLINK MESSAGE ---------------------------
  [NETLINK HEADER] 16 octets
    .nlmsg_len = 240
    .type = 38 <0x26>
    .flags = 0 <>
    .seq = -1745378030
    .port = 373402370
  [GENERIC NETLINK HEADER] 4 octets
    .cmd = 7
    .version = 1
    .unused = 0
  [PAYLOAD] 220 octets
    08 00 03 00 03 00 00 00 11 00 04 00 77 6c 70 31 ............wlp1
    73 30 66 30 75 32 69 33 00 00 00 00 08 00 01 00 s0f0u2i3........
...
---------------------------  END NETLINK MESSAGE   ---------------------------
type 2
name wlp1s0f0u2i3
index 3
tcpdump: verbose output suppressed, use -v[v]... for full protocol decode
listening on wlp1s0f0u2i3, link-type IEEE802_11_RADIO (802.11 plus radiotap header), snapshot length 262144 bytes

@infrastation
Member Author

# opentest -i wlan0 -I
 [pcap-linux.c:643 get_if_type] ifindex == 3
opentest: wlan0: Generic error
# opentest -i wlan1 -I
 [pcap-linux.c:643 get_if_type] ifindex == 5
wlan1 opened successfully

(wlan0 is the built-in interface w/o monitor, wlan1 is a USB adapter w/monitor). If you need, later today I can rig a temporary spare Pi up and give you remote access to it.

@johnthacker
Contributor

Remote access to a test system would be helpful, I think, as I haven't been able to replicate the issue.

@infrastation
Member Author

Done. Guy, I have another spare device that you could use for tests if you need.

@johnthacker
Contributor

OK, I logged in, and I noticed a strange thing. We are sending a single nl80211 message (not a multipart one), so there should be a single response. However, if I try to read all the messages from the socket, the successful reply is followed by an extra netlink message of type "ERROR", reporting error code 0, for success, which I don't see on my system:

--------------------------   BEGIN NETLINK MESSAGE ---------------------------
  [NETLINK HEADER] 16 octets
    .nlmsg_len = 144
    .type = 28 <0x1c>
    .flags = 0 <>
    .seq = 1745627169
    .port = 1040195117
  [GENERIC NETLINK HEADER] 4 octets
    .cmd = 7
    .version = 1
    .unused = 0
  [PAYLOAD] 124 octets
    08 00 03 00 03 00 00 00 0a 00 04 00 77 6c 61 6e ............wlan
    30 00 00 00 08 00 01 00 00 00 00 00 08 00 05 00 0...............
    02 00 00 00 0c 00 99 00 01 00 00 00 00 00 00 00 ................
    0a 00 06 00 d8 3a dd 13 c9 5d 00 00 08 00 2e 00 .....:...]......
    06 00 00 00 05 00 53 00 00 00 00 00 08 00 26 00 ......S.......&.
    3c 14 00 00 08 00 22 01 00 00 00 00 08 00 27 00 <.....".......'.
    01 00 00 00 08 00 9f 00 01 00 00 00 08 00 a0 00 ................
    3c 14 00 00 08 00 62 00 1c 0c 00 00             <.....b.....
---------------------------  END NETLINK MESSAGE   ---------------------------
--------------------------   BEGIN NETLINK MESSAGE ---------------------------
  [NETLINK HEADER] 16 octets
    .nlmsg_len = 36
    .type = 2 <ERROR>
    .flags = 256 <ROOT>
    .seq = 1745627169
    .port = 1040195117
  [ERRORMSG] 20 octets
    .error = 0 "Success"
  [ORIGINAL MESSAGE] 16 octets
    .nlmsg_len = 16
    .type = 28 <0x1c>
    .flags = 5 <REQUEST,ACK>
    .seq = 1745627169
    .port = 1040195117
---------------------------  END NETLINK MESSAGE   ---------------------------

The netlink kernel documentation claims:

"Note that unless the NLM_F_ACK flag is set on the request Netlink will not respond with NLMSG_ERROR if there is no error. To avoid having to special-case this quirk it is recommended to always set NLM_F_ACK."

So, obviously that "quirk" is not happening on the RPi (but is on my box). Then again, the solution presents itself: set NLM_F_ACK on the request and look for the ACK regardless.

@infrastation
Member Author

Thank you for looking. Does your hardware respond the same way to iw phy phy0 interface add mon0 type monitor?

@johnthacker
Contributor

My hardware works. In any case, some more looking suggests that perhaps despite the above comment, Auto-Ack tends to be set on sockets by default, making the flag unnecessary. Adding a line to wait for the ack seems to fix it on the RPi and still works on my box, so I'll do that.
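
A rough sketch of "wait for the ack" at the message level, assuming raw <linux/netlink.h> parsing rather than the libnl calls that libpcap goes through. scan_for_ack() and put_msg() are hypothetical helpers, not libpcap or libnl functions. The scan walks a receive buffer, counts replies that carry the request's sequence number, and stops at the NLMSG_ERROR message, where error 0 is the ACK.

```c
#include <linux/netlink.h>
#include <stdint.h>
#include <string.h>

/* What one pass over a receive buffer found for a given request. */
struct scan_result {
    int replies; /* non-error messages carrying the request's sequence number */
    int acked;   /* 1 once an NLMSG_ERROR with error == 0 was seen */
    int error;   /* negative errno from a failing NLMSG_ERROR, else 0 */
};

/* Walk `len` bytes of netlink messages and look for the ACK to `seq`. */
static struct scan_result scan_for_ack(void *buf, int len, uint32_t seq)
{
    struct scan_result r = {0, 0, 0};
    struct nlmsghdr *nh;

    for (nh = buf; NLMSG_OK(nh, len); nh = NLMSG_NEXT(nh, len)) {
        if (nh->nlmsg_seq != seq)
            continue; /* belongs to some other request */
        if (nh->nlmsg_type == NLMSG_ERROR) {
            struct nlmsgerr *e = NLMSG_DATA(nh);

            if (e->error == 0)
                r.acked = 1;      /* error 0 means "Success", i.e. the ACK */
            else
                r.error = e->error;
            break;                /* the request is finished either way */
        }
        r.replies++;
    }
    return r;
}

/* Illustration helper: append one message with a zeroed payload of
 * `payload_len` bytes at offset `off`; returns the next free offset.
 * `buf` must be suitably aligned and large enough. */
static int put_msg(char *buf, int off, uint16_t type, uint32_t seq,
                   int payload_len)
{
    struct nlmsghdr *nh = (struct nlmsghdr *)(buf + off);

    memset(nh, 0, NLMSG_SPACE(payload_len));
    nh->nlmsg_len = NLMSG_LENGTH(payload_len);
    nh->nlmsg_type = type;
    nh->nlmsg_seq = seq;
    return off + NLMSG_SPACE(payload_len);
}
```

For the dump above this would see one reply (144 octets) followed by the 36-octet ACK. A caller that stops reading after the reply would leave the ACK queued on the socket.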

johnthacker added a commit to johnthacker/libpcap that referenced this issue Apr 26, 2025
The Linux kernel documentation for Netlink says:

"Note that unless the NLM_F_ACK flag is set on the request Netlink will
not respond with NLMSG_ERROR if there is no error. To avoid having to
special-case this quirk it is recommended to always set NLM_F_ACK."
(https://docs.kernel.org/userspace-api/netlink/intro.html#nl-msg-type)

Some drivers, e.g. the built in interface on a Raspberry Pi on a
recent Debian build, appear to send a NLMSG_ERROR with an error
code of 0 (Success) regardless, however. Perhaps this is to avoid
having to special-case whether to expect an ACK.

Regardless, follow instructions and always set NLM_F_ACK, and
receive messages until we get the ACK, to avoid having to special-case
failure and deal with drivers that don't have the "quirk."

Fix the-tcpdump-group#1508

Signed-off-by: John Thacker <[email protected]>
johnthacker added a commit to johnthacker/libpcap that referenced this issue Apr 26, 2025
The Linux kernel documentation for Netlink says:

"Note that unless the NLM_F_ACK flag is set on the request Netlink will
not respond with NLMSG_ERROR if there is no error. To avoid having to
special-case this quirk it is recommended to always set NLM_F_ACK."
(https://docs.kernel.org/userspace-api/netlink/intro.html#nl-msg-type)

In other places, it suggests that sockets do have Auto-ACK enabled
by default, however.

Regardless, follow instructions and always set NLM_F_ACK, and
receive messages until we get the ACK.

Fix the-tcpdump-group#1508

Signed-off-by: John Thacker <[email protected]>
@infrastation
Member Author

Thank you. I have not worked with netlink for many years, but I can run a few additional tests tomorrow to see if everything works as expected.

@infrastation
Member Author

Got a bit delayed by various stuff, but I am getting there.

johnthacker added a commit to johnthacker/libpcap that referenced this issue May 18, 2025
The Linux kernel documentation for Netlink says:

"Note that unless the NLM_F_ACK flag is set on the request Netlink will
not respond with NLMSG_ERROR if there is no error. To avoid having to
special-case this quirk it is recommended to always set NLM_F_ACK."
(https://docs.kernel.org/userspace-api/netlink/intro.html#nl-msg-type)

In other places, it suggests that sockets do have Auto-ACK enabled
by default, however, which does seem to be the case (and we currently
don't set it elsewhere in the code.)

Receive messages until we get the ACK.

Fix the-tcpdump-group#1508

Signed-off-by: John Thacker <[email protected]>
@infrastation
Member Author

Fixed in the master branch; the following commits are ready to be cherry-picked into libpcap-1.10: 8862ca3, 575768e, 258e6ca, 186f908.

@infrastation
Member Author

Cherry-picked.
