-
Notifications
You must be signed in to change notification settings - Fork 45
Allow limited inbound ICMP to Nexus, add ICMP type/code filters to firewall rules #8194
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
5594fc0
to
38bd2ff
Compare
That seems to be be the primary reason for the entire test suite having a Very Bad Time.
Thanks, CRDB.
I've setup these bits on On the actual MTU dscovery front, it looks as though both this PR and If from the web console I perform a 'Reload (Override Cache)' we can kick off large enough TCP transfers to run into LSO. When we trace this through: DublinBRM23230018 # dtrace -n 'xde_mc_tx:entry{ this->m = (mblk_t *)arg1; this->mss = this->m->b_datap->db_struioun.cksum.pad } xde_mc_tx:entry/this->m->b_datap->db_struioun.cksum.flags && this-
>mss != 0xBADD && this->mss/{printf("packet with MSS %d", this->m->b_datap->db_struioun.cksum.pad);}'
dtrace: description 'xde_mc_tx:entry' matched 2 probes
CPU ID FUNCTION:NAME
111 5301 xde_mc_tx:entry packet with MSS 1368
69 5301 xde_mc_tx:entry packet with MSS 1368
69 5301 xde_mc_tx:entry packet with MSS 1368
69 5301 xde_mc_tx:entry packet with MSS 1368
69 5301 xde_mc_tx:entry packet with MSS 1368
69 5301 xde_mc_tx:entry packet with MSS 1368
69 5301 xde_mc_tx:entry packet with MSS 1368
69 5301 xde_mc_tx:entry packet with MSS 1368
69 5301 xde_mc_tx:entry packet with MSS 1368
69 5301 xde_mc_tx:entry packet with MSS 1368
69 5301 xde_mc_tx:entry packet with MSS 1368
69 5301 xde_mc_tx:entry packet with MSS 1368
69 5301 xde_mc_tx:entry packet with MSS 1368
69 5301 xde_mc_tx:entry packet with MSS 1368
69 5301 xde_mc_tx:entry packet with MSS 1368
1 5301 xde_mc_tx:entry packet with MSS 1368
69 5301 xde_mc_tx:entry packet with MSS 1368
69 5301 xde_mc_tx:entry packet with MSS 1368
^C
BRM23230018 # dtrace -n 't4_eth_tx:entry{ this->m = (mblk_t *)arg1; this->mss = this->m->b_datap->db_struioun.cksum.pad } t4_eth_tx:entry/this->m->b_datap->db_struioun.cksum.flags && this-
>mss != 0xBADD && this->mss/{printf("packet with MSS %d", this->m->b_datap->db_struioun.cksum.pad);}'
dtrace: description 't4_eth_tx:entry' matched 2 probes
CPU ID FUNCTION:NAME
9 6360 t4_eth_tx:entry packet with MSS 8928
34 6360 t4_eth_tx:entry packet with MSS 8928
117 6360 t4_eth_tx:entry packet with MSS 8928
120 6360 t4_eth_tx:entry packet with MSS 8928
110 6360 t4_eth_tx:entry packet with MSS 8928
120 6360 t4_eth_tx:entry packet with MSS 8928
86 6360 t4_eth_tx:entry packet with MSS 8928
86 6360 t4_eth_tx:entry packet with MSS 8928
77 6360 t4_eth_tx:entry packet with MSS 8928
78 6360 t4_eth_tx:entry packet with MSS 8928
87 6360 t4_eth_tx:entry packet with MSS 1368
8 6360 t4_eth_tx:entry packet with MSS 8928
69 6360 t4_eth_tx:entry packet with MSS 1368
69 6360 t4_eth_tx:entry packet with MSS 1368
69 6360 t4_eth_tx:entry packet with MSS 1368
78 6360 t4_eth_tx:entry packet with MSS 8928
69 6360 t4_eth_tx:entry packet with MSS 1368
69 6360 t4_eth_tx:entry packet with MSS 1368
69 6360 t4_eth_tx:entry packet with MSS 1368
69 6360 t4_eth_tx:entry packet with MSS 1368
43 6360 t4_eth_tx:entry packet with MSS 1368
69 6360 t4_eth_tx:entry packet with MSS 1368
69 6360 t4_eth_tx:entry packet with MSS 1368
69 6360 t4_eth_tx:entry packet with MSS 1368
69 6360 t4_eth_tx:entry packet with MSS 1368
69 6360 t4_eth_tx:entry packet with MSS 1368
43 6360 t4_eth_tx:entry packet with MSS 1368
43 6360 t4_eth_tx:entry packet with MSS 1368
69 6360 t4_eth_tx:entry packet with MSS 1368
69 6360 t4_eth_tx:entry packet with MSS 1368
69 6360 t4_eth_tx:entry packet with MSS 1368
78 6360 t4_eth_tx:entry packet with MSS 8928
69 6360 t4_eth_tx:entry packet with MSS 1368
69 6360 t4_eth_tx:entry packet with MSS 1368
43 6360 t4_eth_tx:entry packet with MSS 1368
43 6360 t4_eth_tx:entry packet with MSS 1368
^C The latter query is also catching underlay traffic, hence the 8928B entries. An MSS of 1368 matches what I'd maybe expect for a reduced MTU of 1402 bytes (resp. 1448 for a 1500B MTU). Do we have any hits on the new rules? BRM23230018 # opteadm dump-layer -p opte0 firewall
Port opte0 - Layer firewall
======================================================================
Inbound Flows
----------------------------------------------------------------------
PROTO SRC IP SPORT DST IP DPORT HITS ACTION
TCP 172.20.17.94 50056 172.30.2.6 443 0 no-op
TCP 172.20.17.94 50062 172.30.2.6 443 0 no-op
Outbound Flows
----------------------------------------------------------------------
PROTO SRC IP SPORT DST IP DPORT HITS ACTION
TCP 172.30.2.6 443 172.20.17.94 50056 1 no-op
TCP 172.30.2.6 443 172.20.17.94 50062 1 no-op
Inbound Rules
----------------------------------------------------------------------
ID PRI HITS PREDICATES ACTION
126 65534 2 inner.ip.proto=TCP "Stateful Allow"
inner.ulp.dst=80,=443
125 65534 0 inner.icmp.type=time exceeded "Stateful Allow"
124 65534 0 inner.icmp.type=destination unreachable "Stateful Allow"
inner.icmp.code=4
DEF -- 427 -- "deny"
Outbound Rules
----------------------------------------------------------------------
ID PRI HITS PREDICATES ACTION
DEF -- 2 -- "stateful allow" Apparently not, although it's hard to tell given that the reconcile task is completely reinstating the rules periodically so we're losing those counters. Note that all the DogfoodBRM44220011 # dtrace -n 'xde_mc_tx:entry{ this->m = (mblk_t *)arg1; this->mss = this->m->b_datap->db_struioun.cksum.pad } xde_mc_tx:entry/this->m->b_datap->db_struioun.cksum.flags && this-
>mss != 0xBADD && this->mss/{printf("packet with MSS %d", this->m->b_datap->db_struioun.cksum.pad);}'
dtrace: description 'xde_mc_tx:entry' matched 2 probes
CPU ID FUNCTION:NAME
41 4868 xde_mc_tx:entry packet with MSS 1448
9 4868 xde_mc_tx:entry packet with MSS 1368
68 4868 xde_mc_tx:entry packet with MSS 1368
68 4868 xde_mc_tx:entry packet with MSS 1368
68 4868 xde_mc_tx:entry packet with MSS 1368
68 4868 xde_mc_tx:entry packet with MSS 1368
68 4868 xde_mc_tx:entry packet with MSS 1368
68 4868 xde_mc_tx:entry packet with MSS 1368
68 4868 xde_mc_tx:entry packet with MSS 1368
68 4868 xde_mc_tx:entry packet with MSS 1368
127 4868 xde_mc_tx:entry packet with MSS 1368
68 4868 xde_mc_tx:entry packet with MSS 1368
68 4868 xde_mc_tx:entry packet with MSS 1368
68 4868 xde_mc_tx:entry packet with MSS 1368
68 4868 xde_mc_tx:entry packet with MSS 1368
68 4868 xde_mc_tx:entry packet with MSS 1368
68 4868 xde_mc_tx:entry packet with MSS 1368
68 4868 xde_mc_tx:entry packet with MSS 1368
68 4868 xde_mc_tx:entry packet with MSS 1368
68 4868 xde_mc_tx:entry packet with MSS 1368
68 4868 xde_mc_tx:entry packet with MSS 1368
83 4868 xde_mc_tx:entry packet with MSS 1368
83 4868 xde_mc_tx:entry packet with MSS 1368
83 4868 xde_mc_tx:entry packet with MSS 1368
83 4868 xde_mc_tx:entry packet with MSS 1368
^C
BRM44220011 # dtrace -n 't4_eth_tx:entry{ this->m = (mblk_t *)arg1; this->mss = this->m->b_datap->db_struioun.cksum.pad } t4_eth_tx:entry/this->m->b_datap->db_struioun.cksum.flags && this->mss != 0xBADD && this->mss/{printf("packet with MSS %d", this->m->b_datap->db_struioun.cksum.pad);}'
dtrace: description 't4_eth_tx:entry' matched 2 probes
CPU ID FUNCTION:NAME
18 5927 t4_eth_tx:entry packet with MSS 8928
52 5927 t4_eth_tx:entry packet with MSS 8928
52 5927 t4_eth_tx:entry packet with MSS 1448
113 5927 t4_eth_tx:entry packet with MSS 1368
68 5927 t4_eth_tx:entry packet with MSS 1368
68 5927 t4_eth_tx:entry packet with MSS 1368
68 5927 t4_eth_tx:entry packet with MSS 1368
76 5927 t4_eth_tx:entry packet with MSS 8928
68 5927 t4_eth_tx:entry packet with MSS 1368
68 5927 t4_eth_tx:entry packet with MSS 1368
68 5927 t4_eth_tx:entry packet with MSS 1368
68 5927 t4_eth_tx:entry packet with MSS 1368
68 5927 t4_eth_tx:entry packet with MSS 1368
68 5927 t4_eth_tx:entry packet with MSS 1368
68 5927 t4_eth_tx:entry packet with MSS 1368
44 5927 t4_eth_tx:entry packet with MSS 1448
68 5927 t4_eth_tx:entry packet with MSS 1448
68 5927 t4_eth_tx:entry packet with MSS 1448
68 5927 t4_eth_tx:entry packet with MSS 1448
68 5927 t4_eth_tx:entry packet with MSS 1448
44 5927 t4_eth_tx:entry packet with MSS 1448
68 5927 t4_eth_tx:entry packet with MSS 1448
68 5927 t4_eth_tx:entry packet with MSS 1448
68 5927 t4_eth_tx:entry packet with MSS 1448
44 5927 t4_eth_tx:entry packet with MSS 1448
68 5927 t4_eth_tx:entry packet with MSS 1448
68 5927 t4_eth_tx:entry packet with MSS 1448
68 5927 t4_eth_tx:entry packet with MSS 1448
44 5927 t4_eth_tx:entry packet with MSS 1448
68 5927 t4_eth_tx:entry packet with MSS 1448
68 5927 t4_eth_tx:entry packet with MSS 1448
68 5927 t4_eth_tx:entry packet with MSS 1448
68 5927 t4_eth_tx:entry packet with MSS 1448
68 5927 t4_eth_tx:entry packet with MSS 1448
68 5927 t4_eth_tx:entry packet with MSS 1448
68 5927 t4_eth_tx:entry packet with MSS 1448
68 5927 t4_eth_tx:entry packet with MSS 1448
68 5927 t4_eth_tx:entry packet with MSS 1448
44 5927 t4_eth_tx:entry packet with MSS 1448
68 5927 t4_eth_tx:entry packet with MSS 1448
44 5927 t4_eth_tx:entry packet with MSS 1448
68 5927 t4_eth_tx:entry packet with MSS 1448
68 5927 t4_eth_tx:entry packet with MSS 1448
68 5927 t4_eth_tx:entry packet with MSS 1448
44 5927 t4_eth_tx:entry packet with MSS 1448
68 5927 t4_eth_tx:entry packet with MSS 1448
68 5927 t4_eth_tx:entry packet with MSS 1448
68 5927 t4_eth_tx:entry packet with MSS 1448
68 5927 t4_eth_tx:entry packet with MSS 1448
44 5927 t4_eth_tx:entry packet with MSS 1448
68 5927 t4_eth_tx:entry packet with MSS 1448
68 5927 t4_eth_tx:entry packet with MSS 1448
44 5927 t4_eth_tx:entry packet with MSS 1448
68 5927 t4_eth_tx:entry packet with MSS 1448
68 5927 t4_eth_tx:entry packet with MSS 1448
68 5927 t4_eth_tx:entry packet with MSS 1448
68 5927 t4_eth_tx:entry packet with MSS 1448
68 5927 t4_eth_tx:entry packet with MSS 1448
68 5927 t4_eth_tx:entry packet with MSS 1448
68 5927 t4_eth_tx:entry packet with MSS 1448
68 5927 t4_eth_tx:entry packet with MSS 1448
44 5927 t4_eth_tx:entry packet with MSS 1448
68 5927 t4_eth_tx:entry packet with MSS 1448
44 5927 t4_eth_tx:entry packet with MSS 1448
68 5927 t4_eth_tx:entry packet with MSS 1448
68 5927 t4_eth_tx:entry packet with MSS 1448
68 5927 t4_eth_tx:entry packet with MSS 1448
68 5927 t4_eth_tx:entry packet with MSS 1448
44 5927 t4_eth_tx:entry packet with MSS 1448
68 5927 t4_eth_tx:entry packet with MSS 1448
68 5927 t4_eth_tx:entry packet with MSS 1448
44 5927 t4_eth_tx:entry packet with MSS 1448
68 5927 t4_eth_tx:entry packet with MSS 1448
68 5927 t4_eth_tx:entry packet with MSS 1448
68 5927 t4_eth_tx:entry packet with MSS 1448
44 5927 t4_eth_tx:entry packet with MSS 1448
68 5927 t4_eth_tx:entry packet with MSS 1448
44 5927 t4_eth_tx:entry packet with MSS 1448
68 5927 t4_eth_tx:entry packet with MSS 1368
68 5927 t4_eth_tx:entry packet with MSS 1368
68 5927 t4_eth_tx:entry packet with MSS 1368
68 5927 t4_eth_tx:entry packet with MSS 1368
18 5927 t4_eth_tx:entry packet with MSS 1368
18 5927 t4_eth_tx:entry packet with MSS 1368
68 5927 t4_eth_tx:entry packet with MSS 1368
68 5927 t4_eth_tx:entry packet with MSS 1368
68 5927 t4_eth_tx:entry packet with MSS 1368
68 5927 t4_eth_tx:entry packet with MSS 1368
120 5927 t4_eth_tx:entry packet with MSS 1368
68 5927 t4_eth_tx:entry packet with MSS 1368
68 5927 t4_eth_tx:entry packet with MSS 1368
68 5927 t4_eth_tx:entry packet with MSS 1368
68 5927 t4_eth_tx:entry packet with MSS 1368
68 5927 t4_eth_tx:entry packet with MSS 1368
^C
76 5927 t4_eth_tx:entry packet with MSS 8928
95 5927 t4_eth_tx:entry packet with MSS 1368
78 5927 t4_eth_tx:entry packet with MSS 1368
43 5927 t4_eth_tx:entry packet with MSS 1368
68 5927 t4_eth_tx:entry packet with MSS 1368
43 5927 t4_eth_tx:entry packet with MSS 1368
43 5927 t4_eth_tx:entry packet with MSS 1368
60 5927 t4_eth_tx:entry packet with MSS 1368
43 5927 t4_eth_tx:entry packet with MSS 1368
60 5927 t4_eth_tx:entry packet with MSS 1368
60 5927 t4_eth_tx:entry packet with MSS 1368
1 5927 t4_eth_tx:entry packet with MSS 1368
113 5927 t4_eth_tx:entry packet with MSS 1368
1 5927 t4_eth_tx:entry packet with MSS 1368
69 5927 t4_eth_tx:entry packet with MSS 1368
86 5927 t4_eth_tx:entry packet with MSS 1368
52 5927 t4_eth_tx:entry packet with MSS 1368
68 5927 t4_eth_tx:entry packet with MSS 1368
42 5927 t4_eth_tx:entry packet with MSS 1368
42 5927 t4_eth_tx:entry packet with MSS 1368
42 5927 t4_eth_tx:entry packet with MSS 1368
103 5927 t4_eth_tx:entry packet with MSS 1368
19 5927 t4_eth_tx:entry packet with MSS 1368
19 5927 t4_eth_tx:entry packet with MSS 1368
68 5927 t4_eth_tx:entry packet with MSS 1368
68 5927 t4_eth_tx:entry packet with MSS 1368 There's a mix in there, but I assume we have something on the lab-side hitting the API periodically, since I see those 1448 entries passively. But dogfood appears to be discovering and using the reduced MTU just fine. Fixing oxidecomputer/opte#748 may have been sufficient to get this working, and that PR has been pulled into |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is good for review, I've pointed out some open design points I'm unsure on.
// Port assignments are incompatible with non | ||
// TCP/UDP protocols. | ||
if matches!( | ||
proto, | ||
ProtoFilter::Tcp | ProtoFilter::Udp | ||
) { | ||
filters.set_ports(ports.clone()); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm surprised we haven't run into this one yet -- if someone were to install a rule on [TCP, ICMP] x [port 80, port 443]
this would previously be installed as the two rules:
- proto = TCP && (port = 80 || port = 443), [valid]
- proto = ICMP && (port = 80 || port = 443). [invalid]
In OPTE, a port match will always return false unless it is applied to TCP/UDP traffic.
targets: vec![VpcFirewallRuleTarget::Subnet( | ||
super::vpc_subnet::NEXUS_VPC_SUBNET.name().clone(), | ||
)], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we want to include any more services here?
protocols: Some(vec![ | ||
VpcFirewallRuleProtocol::Icmp(Some(VpcFirewallIcmpFilter { | ||
// Type 3 -- Destination Unreachable | ||
icmp_type: 3, | ||
// Code 4 -- Fragmentation needed | ||
code: Some(4.into()), | ||
})), | ||
VpcFirewallRuleProtocol::Icmp(Some(VpcFirewallIcmpFilter { | ||
// Type 11 -- Time Exceeded | ||
icmp_type: 11, | ||
code: None, | ||
})), | ||
]), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This gives us PMTUD (although it looks as though the TCP stack can figure it out in the absence of Fragmentation Needed messages?), and the ability to use traceroute
from Nexus/DNS. Is the latter needed? Are there any other ICMP families which would be crucial?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jclulow Do you have any thoughts here on any missing types/codes which might be needed (or on whether to extend the scope to, e.g., external DNS)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From the Network sync today, the disposition seems to be a definite yes for extending this to external DNS (making sure we don't have any poor interactions with remote IP filters), and possibly allowing through ICMP redirect messages. I'll investigate the latter for a bit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@rcgoodfellow I've extended the set here to include ICMP redirects and port unreachable messages (on top of the ones discussed initially), which I think is a good minimum set.
For both classes of zone, this ignores the nexus host IP filter for the same reason that we can't apply it to external DNS (i.e., it's very hard to know a priori which nodes can/will send us error messages). I've put a comment explaining this where this logic occurs today:
omicron/nexus/networking/src/firewall_rules.rs
Lines 320 to 325 in 63e1f70
// Note that inbound ICMP is not currently governed by this filter, | |
// as error-type ICMP messages can arrive from any host which a | |
// Nexus/DNS zone reaches out to *or* a gateway/router on the path to | |
// that destination. | |
(None, Some(allowed_ips)) => { | |
if allowlist_applies_to_firewall_rule(rule) { |
Looks like you need to re-run |
This change reduces the binary size of opteadm and brings in the deadlock fix for oxidecomputer/opte#758. Backports a few pieces from #8194 to support some minor rework of the ioctl interface. - [x] Swap back to maghemite main. - [x] Await creation of lab-helios image.
A few types (target & filter specifications, ranges) associated with firewall rules and routes rely on a custom string-based representation in CRDB. However, when deserializing them today we end up cycling through `PgValue`->`String`->`ParsedEnumType`, needlessly allocating and copying strings per element. This PR prevents us from doing so by generating these `impl`s for any type which implements the existing `DatabaseString` trait. While there are more types in `nexus-db-model` using `FromSql<Text, _>`, most seem to make use of a `String` internally regardless, so there's no immediate benefit (other than consistency) in changing them up at this time. Motivated by some of the discussion in #8194.
A few types (target & filter specifications, ranges) associated with firewall rules and routes rely on a custom string-based representation in CRDB. However, when deserializing them today we end up cycling through `PgValue`->`String`->`ParsedEnumType`, needlessly allocating and copying strings per element. This PR prevents us from doing so by generating these `impl`s for any type which implements the existing `DatabaseString` trait. While there are more types in `nexus-db-model` using `FromSql<Text, _>`, most seem to make use of a `String` internally regardless, so there's no immediate benefit (other than consistency) in changing them up at this time. Motivated by some of the discussion in #8194.
Redirect came out of our earlier discussion, port unreachable should be useful in the limit too.
A few types (target & filter specifications, ranges) associated with firewall rules and routes rely on a custom string-based representation in CRDB. However, when deserializing them today we end up cycling through `PgValue`->`String`->`ParsedEnumType`, needlessly allocating and copying strings per element. This PR prevents us from doing so by generating these `impl`s for any type which implements the existing `DatabaseString` trait. While there are more types in `nexus-db-model` using `FromSql<Text, _>`, most seem to make use of a `String` internally regardless, so there's no immediate benefit (other than consistency) in changing them up at this time. Motivated by some of the discussion in #8194.
@hawkw Checking in on this, are we happy with this from an API/DB/Control-plane point of view? I'll fixup the merge conflict once dogfood's updated.
@taspelund / @rcgoodfellow , are we happy with this as a starting point for ICMP? I appreciate there's more we can do on giving operators precise control inclusive of e.g. ping (https://github.com/oxidecomputer/customer-support/issues/334), but then we need to worry more about the interaction with nexus IP allow lists etc. |
Yup, all good on my end! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me provided everything seems right from the networking perspective!
To make sure I understand this correctly, we're only letting DU through to Nexus? We must be careful not to introduce changes that would make Nexus more visible to scanners and the like. |
The set here is:
|
This reverts commit 5a5308f. It looks as though something is gummed up in pkg, since 0.37.381 is the last available version there.
This PR adds more detail to firewall protocol filters. Rather than being specified purely as a string->enum mapping, each protocol filter is now a standard tagged enum like many other types in the Nexus API. Here, this allows us to express more control over ICMP traffic. For instance:
[{"type": "icmp"}]
),[{"type": "icmp", "value": { icmp_type: 0 }}, {"type": "icmp", "value": { icmp_type: 8 }}]
){"type": "icmp", "value": { icmp_type: 0, code: "0-4" }}
).Building on this, Nexus now has a firewall entry to permit the receipt of e.g., Destination Unreachable messages. I've added a new system endpoint to control whether this is enabled or disabled.
As part of this, I've converted the CRDB representation from a fixed
ENUM[]
to aSTRING(32)[]
. There are a couple of reasons for this:Should close #7998.
TODO:
main
.