Broken allocation failure branches #129

Open
riastradh opened this issue Jan 31, 2023 · 1 comment

Description

npf_conn_establish (invoked from the packet-processing path in softint context) has an error branch to handle memory allocation failure in thmap_put (via npf_conndb_insert), but the error branch calls thmap_del (via npf_conndb_remove), which relies on memory allocation to succeed (rmind/thmap#11):

npf/src/kern/npf_conn.c

Lines 477 to 480 in 2efbe28

	if (!npf_conndb_insert(conn_db, bk, con, NPF_FLOW_BACK)) {
		npf_conn_t *ret __diagused;
		ret = npf_conndb_remove(conn_db, fw);
		KASSERT(ret == con);

This error branch is essentially guaranteed to crash -- see, e.g.: https://gnats.netbsd.org/57208
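To illustrate the failure mode, here is a minimal userspace sketch, not the NPF code itself. All names (`toy_map`, `toy_put`, `toy_del`, `toy_establish`, `alloc_budget`) are hypothetical, and the map only counts entries. The assumption it models is the one stated above: both insertion and deletion need an allocation, so the rollback runs exactly when the allocator has just refused a request.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical allocator with a fixed budget, standing in for a
 * non-sleeping allocation that can fail under memory pressure. */
static int alloc_budget;

static bool toy_alloc(void)
{
	if (alloc_budget == 0)
		return false;
	alloc_budget--;
	return true;
}

/* Toy map that only counts its entries. */
struct toy_map {
	int nitems;
};

/* Insertion needs memory for the new entry. */
static bool toy_put(struct toy_map *m)
{
	if (!toy_alloc())
		return false;
	m->nitems++;
	return true;
}

/* Deletion also needs memory (assumed here: a deferred-free record). */
static bool toy_del(struct toy_map *m)
{
	if (!toy_alloc())
		return false;
	m->nitems--;
	return true;
}

/*
 * Insert a forward and a backward entry, rolling back on failure.
 * Returns 0 on success, -1 on a clean failure, and -2 when the
 * rollback itself could not be completed.
 */
static int toy_establish(struct toy_map *m)
{
	if (!toy_put(m))
		return -1;
	if (!toy_put(m)) {
		if (!toy_del(m))
			return -2;	/* stale forward entry left behind */
		return -1;
	}
	return 0;
}
```

With a budget of one allocation, the first insert succeeds, the second fails, and the rollback fails as well, which corresponds to the case where the real code trips its assertion.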

Environment and configuration

Environment:

  • NPF environment: NetBSD
  • Operating system version: NetBSD xxx.xxxx.net 9.2 NetBSD 9.2 (GENERIC) #0: Wed May 12 13:15:55 UTC 2021 [email protected]:/usr/src/sys/arch/amd64/compile/GENERIC amd64
  • NPF version: NetBSD 9.2 (but the problem persists in npf master, NetBSD HEAD)

Configuration:
N/A

Any additional information

  • For userspace, where allocation is never guaranteed to succeed, thmap_del callers need to be taught to handle failure and retry later.
  • For the kernel, where allocation can sleep, thmap_del needs to be made to sleep, and callers must defer it to thread context (with no spin locks held) where it can safely do so. npf_tableset.c needs to do this outside any spin locks.
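The "handle failure and retry later" option could look roughly like the sketch below. This is an assumption about one possible caller-side shape, not a proposed patch: `struct entry`, `try_del`, `remove_or_defer`, `retry_pending` and `del_budget` are all hypothetical names, and the fallible delete is simulated with a counter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NENTRIES 4

/* Hypothetical entry; 'doomed' marks a removal that has not succeeded yet. */
struct entry {
	bool present;
	bool doomed;
};

static struct entry table[NENTRIES];
static int del_budget;	/* how many fallible deletes may still succeed */

/* Stand-in for a delete that can fail for lack of memory. */
static bool try_del(struct entry *e)
{
	if (del_budget == 0)
		return false;
	del_budget--;
	e->present = false;
	e->doomed = false;
	return true;
}

/* Fast path: attempt the removal and remember it if it fails. */
static void remove_or_defer(struct entry *e)
{
	if (!try_del(e))
		e->doomed = true;
}

/*
 * Called later from a context where retrying is safe.
 * Returns the number of removals that are still pending.
 */
static int retry_pending(void)
{
	int pending = 0;

	for (size_t i = 0; i < NENTRIES; i++) {
		if (table[i].doomed && !try_del(&table[i]))
			pending++;
	}
	return pending;
}
```

A real caller would also have to make lookups treat a doomed entry as absent, which this sketch leaves out.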
riastradh added the bug label Jan 31, 2023

pettai commented Feb 1, 2023

Here's the configuration of the crashing system, FWIW

$wired_if = "bnx0"
$wired_v4 = inet4(bnx0)
$wired_v6 = inet6(bnx0)
$lab_if = inet4(bnx1)

table <blacklist> type hash file "/etc/npf_blacklist"
table <internal> type tree file "/etc/npf_internal"

$services_tcp = { domain, ssh, http, https }
$services_udp = { domain, ntp }

$internal_tcp = { 5666 }
$internal_udp = { bootps, syslog }

alg "icmp"

procedure "log" {
        # Note: npf_ext_log kernel module should be loaded, if not built-in.
        # Also, the interface created, e.g.: ifconfig npflog0 create
        log: npflog0
}

group "wired" on $wired_if {
        block in final from <blacklist>
        pass stateful in final family inet4 proto icmp to $wired_v4
        pass in final family inet6 proto ipv6-icmp to $wired_v6
        pass stateful in final family inet4 proto tcp to $wired_v4 port ssh apply "log"
        pass in final family inet6 proto tcp to $wired_v6 port ssh apply "log"
        pass in final family inet4 proto tcp from X.X.Y.Z to $wired_v4 port 782    # ConServer
        pass in final family inet4 proto tcp from X.X.Y.Z to $wired_v4 port 60000-65535    # ConServer
        pass stateful in final family inet4 proto tcp from <internal> to $wired_v4 port $internal_tcp
        pass stateful in final family inet4 proto udp from <internal> to $wired_v4 port $internal_tcp
        pass stateful in final family inet4 proto tcp to $wired_v4 port $services_tcp
        pass in final family inet6 proto tcp to $wired_v6 port $services_tcp
        pass stateful in final family inet4 proto udp to $wired_v4 port $services_udp
        pass in final family inet6 proto udp to $wired_v6 port $services_udp
        pass stateful in final family inet4 proto tcp to $wired_v4 port 49151-65535    # Passive FTP
        pass stateful in final family inet6 proto tcp to $wired_v6 port 49151-65535    # Passive FTP
        # pass stateful in final family inet4 proto udp to $wired_v4 port 33434-33600    # Traceroute
        # pass in final family inet6 proto udp to $wired_v6 port 33434-33600    # Traceroute

        # only SYN packets need to generate state
        pass stateful out final family inet6 proto tcp flags S/SA from $wired_v6
        pass stateful out final family inet4 proto tcp flags S/SA from $wired_v4
        # pass the other tcp packets without generating extra state
        pass out final family inet6 proto tcp from $wired_v6
        pass out final family inet4 proto tcp from $wired_v4

        # all other types of traffic, generate state per packet
        pass stateful out final family inet6 from $wired_v6
        pass stateful out final family inet4 from $wired_v4
}

group "lab" {
        pass final on $lab_if all
}

group default {
        pass final on lo0 all
        block all apply "log"
}

netbsd-srcmastr pushed a commit to NetBSD/src that referenced this issue Oct 17, 2023
thmap_del can't fail, and it is used in places in npf where sleeping
is forbidden, so it can't rely on allocating memory either.

Instead of having thmap_del allocate memory on the fly for each
object to defer freeing until thmap_gc, arrange to have thmap(9)
preallocate the same storage when allocating all the objects in the
first place, with a GC header.

This is suboptimal for memory usage, especially on insertion- and
lookup-heavy but deletion-light workloads, but it's not clear rmind's
alternative (https://github.com/rmind/thmap/tree/thmap_del_mem_fail)
is ready to use yet, so we'll go with this for correctness.

PR kern/57208
rmind/npf#129

XXX pullup-10
XXX pullup-9
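The preallocation idea in the commit message above can be sketched in userspace C as follows. This is only an illustration of the technique under stated assumptions, not the subr_thmap.c change: `struct gc_hdr`, `obj_alloc`, `obj_retire` and `gc_run` are hypothetical names, `malloc` stands in for the kernel allocator, and the payload is assumed to need no stricter alignment than the header provides.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical GC header stored in front of every object. */
struct gc_hdr {
	struct gc_hdr *next;
	size_t len;
};

static struct gc_hdr *gc_list;	/* objects waiting to be freed */

/* Allocate an object together with its GC header; returns the payload. */
static void *obj_alloc(size_t len)
{
	struct gc_hdr *h = malloc(sizeof(*h) + len);

	if (h == NULL)
		return NULL;
	h->next = NULL;
	h->len = len;
	return h + 1;
}

/*
 * Deletion path: queue the object for later freeing using the header
 * that was reserved at allocation time. No allocation, so no failure.
 */
static void obj_retire(void *obj)
{
	struct gc_hdr *h = (struct gc_hdr *)obj - 1;

	h->next = gc_list;
	gc_list = h;
}

/* Run later, once no reader can still hold a reference; returns the count freed. */
static int gc_run(void)
{
	int n = 0;

	while (gc_list != NULL) {
		struct gc_hdr *h = gc_list;

		gc_list = h->next;
		free(h);
		n++;
	}
	return n;
}
```

The cost is the one the commit message names: every object carries the header for its whole lifetime, even if it is never deleted.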
netbsd-srcmastr pushed a commit to NetBSD/src that referenced this issue Oct 20, 2023
	sys/kern/subr_thmap.c: revision 1.14
	sys/kern/subr_thmap.c: revision 1.15

thmap(9): Test alloc failure, not THMAP_GETPTR failure.
THMAP_GETPTR may return nonnull even though alloc returned zero.

Note that this failure branch is not actually appropriate;
thmap_create should not fail.  We really need to pass KM_SLEEP
through in this call site even though there are other call sites for
which KM_NOSLEEP is appropriate.

Adapted from: rmind/thmap#14
PR kern/57666
rmind/thmap#13
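A small sketch of why checking the converted pointer is not enough. It assumes, for illustration, that the allocator returns an offset with 0 meaning failure and that the pointer conversion adds a nonzero base address; `toy_map`, `TOY_GETPTR`, `toy_alloc_fail`, `create_buggy` and `create_fixed` are hypothetical stand-ins rather than the thmap definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical map with a base address that offsets are relative to. */
struct toy_map {
	uintptr_t baseptr;
};

/* Offset-to-pointer conversion: base plus offset. */
#define TOY_GETPTR(m, off)	((void *)((m)->baseptr + (uintptr_t)(off)))

/* Allocator that always fails, signalled by a zero offset. */
static uintptr_t toy_alloc_fail(size_t len)
{
	(void)len;
	return 0;
}

/* Buggy check: tests the converted pointer, which is base + 0. */
static int create_buggy(struct toy_map *m)
{
	uintptr_t off = toy_alloc_fail(64);
	void *p = TOY_GETPTR(m, off);

	return p != NULL ? 0 : -1;	/* reports success despite the failure */
}

/* Fixed check: tests the allocator's return value before converting. */
static int create_fixed(struct toy_map *m)
{
	uintptr_t off = toy_alloc_fail(64);

	if (off == 0)
		return -1;
	(void)TOY_GETPTR(m, off);
	return 0;
}
```

With a nonzero base, the buggy variant reports success on a failed allocation while the fixed variant reports the failure.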

thmap(9): Preallocate GC list storage for thmap_del.
thmap_del can't fail, and it is used in places in npf where sleeping
is forbidden, so it can't rely on allocating memory either.
Instead of having thmap_del allocate memory on the fly for each
object to defer freeing until thmap_gc, arrange to have thmap(9)
preallocate the same storage when allocating all the objects in the
first place, with a GC header.

This is suboptimal for memory usage, especially on insertion- and
lookup-heavy but deletion-light workloads, but it's not clear rmind's
alternative (https://github.com/rmind/thmap/tree/thmap_del_mem_fail)
is ready to use yet, so we'll go with this for correctness.
PR kern/57208

rmind/npf#129
rokuyama pushed a commit to IIJ-NetBSD/netbsd-src that referenced this issue Oct 26, 2023
rokuyama pushed a commit to IIJ-NetBSD/netbsd-src that referenced this issue Oct 26, 2023
rokuyama pushed a commit to IIJ-NetBSD/netbsd-src that referenced this issue Oct 27, 2023