You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
bpf, sockmap: avoid using sk_socket after free when sending
The sk->sk_socket is not locked or referenced in backlog thread, and
during the call to skb_send_sock(), there is a race condition with
the release of sk_socket. All types of sockets(tcp/udp/unix/vsock)
will be affected.
Race conditions:
'''
CPU0 CPU1
backlog::skb_send_sock
sendmsg_unlocked
sock_sendmsg
sock_sendmsg_nosec
close(fd):
...
ops->release() -> sock_map_close()
sk_socket->ops = NULL
free(socket)
sock->ops->sendmsg
^
panic here
'''
The ref of psock become 0 after sock_map_close() executed.
'''
void sock_map_close()
{
...
if (likely(psock)) {
...
// !! here we remove psock and the ref of psock become 0
sock_map_remove_links(sk, psock)
psock = sk_psock_get(sk);
if (unlikely(!psock))
goto no_psock; <=== Control jumps here via goto
...
cancel_delayed_work_sync(&psock->work); <=== not executed
sk_psock_put(sk, psock);
...
}
'''
Based on the fact that we already wait for the workqueue to finish in
sock_map_close() if psock is held, we simply increase the psock
reference count to avoid race conditions.
With this patch, if the backlog thread is running, sock_map_close() will
wait for the backlog thread to complete and cancel all pending work.
If no backlog running, any pending work that hasn't started by then will
fail when invoked by sk_psock_get(), as the psock reference count have
been zeroed, and sk_psock_drop() will cancel all jobs via
cancel_delayed_work_sync().
In summary, we require synchronization to coordinate the backlog thread
and close() thread.
The panic I catched:
'''
Workqueue: events sk_psock_backlog
RIP: 0010:sock_sendmsg+0x21d/0x440
RAX: 0000000000000000 RBX: ffffc9000521fad8 RCX: 0000000000000001
...
Call Trace:
<TASK>
? die_addr+0x40/0xa0
? exc_general_protection+0x14c/0x230
? asm_exc_general_protection+0x26/0x30
? sock_sendmsg+0x21d/0x440
? sock_sendmsg+0x3e0/0x440
? __pfx_sock_sendmsg+0x10/0x10
__skb_send_sock+0x543/0xb70
sk_psock_backlog+0x247/0xb80
...
'''
Reported-by: Michal Luczaj <[email protected]>
Fixes: 799aa7f ("skmsg: Avoid lock_sock() in sk_psock_backlog()")
Signed-off-by: Jiayuan Chen <[email protected]>
0 commit comments