-
Notifications
You must be signed in to change notification settings - Fork 107
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix regression in source IP spoofing #1085
Fix regression in source IP spoofing #1085
Conversation
We were using the `cfg` in pool since istio#1040. `cfg` is not whether its used, but rather the intent of the user -- we may enable or disable at runtime. This renames the field to make it clear it should not be used like this, and fixes the issue. This went through CI due to istio#1084.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks correct but I left a few comments where I think it could be better/clearer. added a hold so you can choose address comments. Feel free to drop the hold if you don't want to address.
src/proxy/inbound.rs
Outdated
let enable_orig_src = pi | ||
.cfg | ||
.explicitly_configure_original_source | ||
.unwrap_or(transparent); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
kind of off topic but:
What is the point of enable_orig_src and this second set of logic? It seems like we shouldn't get enable_orig_src and transparent with different values ever.
if the cfg is true then we either return true or an error and if error the we return early at the ?
and shouldn't reach this line. so if the config asked for true they are both true effectively.
if cfg contains false we return false so both are false.
if cfg contains none we best effort and return the result, go through the unwrap_or to set enable_orig_src to be the value in transparent.
what am I missing?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if the cfg is true then we either return true or an error and if error the we return early at the ? and shouldn't reach this line. so if the config asked for true they are both true effectively.
maybe_set_transparent
shouldn't propagate the error yea, it should swallow/log and return the bool.
EDIT: nevermind.
What is the point of enable_orig_src and this second set of logic? It seems like we shouldn't get enable_orig_src and transparent with different values ever.
there's a few states here:
orig_src == true
> if we can set transparent, then good, if we cannot, then error.
orig_src == unset
-> if we can set transparent, then inform other things we are using orig_src, otherwise inform them we are not.
tl;dr
maybe_set_transparent
might return an error, or false
- as dictated by the cfg. It's a weird sort of fallthru.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if maybe set_transparent returns an error then we never get here though. we return early at the ?
. Aside from that if maybe_set_returns an error then we are in conflict and we just treat error as meaning false so as to not return early we wind up at a different problem. The active state would be false even though the requested state is true. Setting the variable that contains the active state to be true because the user asked for it means that variable no longer actually reflects the active state so that seems wrong to me.
tldr, seems like if we want to know the active state we should just use transparent which contains the active state. I think that is what we are doing, in a round about way, so I don't see a bug in the present logic it just seems like an unnecessary step is being performed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should just panic
if the config flag is set and we can't do maybe_set_transparent
because of lack of perms, then we can just use transparent
in all cases?
I'm not sure we care about any other permutations.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if maybe set_transparent returns an error then we never get here though. we return early at the ?
that's true, but it will not always return an error, whether it does or not is dependent on this flag.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it either has an error or it contains the state that resulted when we tried to set transparent, yeah? if it's an error then enable_orig_src
doesn't matter because we never reach it. If its not an error then we really just want what is in transparent
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If its not an error then we really just want what is in transparent
unless enable_orig_src was explicitly set to false
as user intent, in which case we want to discard the result of the attempt and use the user intent.
It's goofy. I think we should use 2 config fields.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
but if the user asked for false we just don't attempt to set transparent and return false
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Intent (and, afaik, implementation) is:
- default, None: try to set transparent (requires privs), if it doesn't work proceed with original_src=false
- Some(true): set transparent or fail. use original src
- Some(false): don't bother trying transparent, do not use original src
The higher level intent is to make it so we gracefully degrade without privs, but can explicitly require/not
@@ -337,6 +336,7 @@ impl WorkloadHBONEPool { | |||
// Callers should then be safe to drop() the pool instance. | |||
pub fn new( | |||
cfg: Arc<crate::config::Config>, | |||
original_source: bool, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It could probably be made more clear that this is expected to reflect running state rather than user intent. Maybe just add to the comment on new. I don't really know how to better name the parameter to add that kind of information.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the idea is to convey that thru mutability, cfg
isn't mutable, so it's by-definition fixed config/intent - you can't change it. If you want to calc something at runtime, calculate it, bind it, and pass it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do agree this should be a parameter. What I was attempting to "solve for" is misuse in the future by accidentally just setting this parameter to what's in the config because it's not actually all that clear what the parameter is for. Then we're sort of right back here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah that's why I think the config flag should just be two flags - the main problem is that the config property itself has overloaded meaning that's not obvious from looking at it: #1085 (comment)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
as it is we have the advantage of not being able to specify an invalid state... what if they set require and disable to true?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can make it a 3 way enum - the potential for invalid state is sort of an inescapable aspect of you telling us to do A but us maybe not being able to do it because of circumstances beyond our control.
.enable_original_source | ||
.unwrap_or_default() | ||
.then_some(key.src); | ||
let local = self.original_source.then_some(key.src); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ugh yeah I missed this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
well back when config was mut I think we mutated the meaning of that field from "user intent" to "active state" so it is pretty subtle
src/config.rs
Outdated
pub enable_original_source: Option<bool>, | ||
// If set, explicitly configure whether to use original source. | ||
// If unset (recommended), this is automatically detected based on permissions. | ||
pub explicitly_configure_original_source: Option<bool>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pub explicitly_configure_original_source: Option<bool>, | |
// If set, throw an error if we lack the permissions to enable original_source. | |
// If unset (recommended), use of original_source is automatically detected based on permissions. | |
pub require_original_source: Option<bool>, |
maybe this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
or maybe we just split this into two, since the preferred default is autodetection.
require_original_source
disable_original_source
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or just a 3 way enum -
orig_src: enable | disable | autodetect
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or just a 3 way enum
If it wasn't clear, that is what Option<bool>
is (just in a more confusing representation). Agree an enum is way more explicit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess there are two aspects to this:
- internal representation, enum is awesome if we want to go to there, I have no problem changing from
Option<bool>
to something clearer internally. - external representation, enum is basically unsupported so it's string or int... I think keeping the optional bool and adding documentation if we feel it is unclear makes sense for this.
src/proxy/inbound.rs
Outdated
let enable_orig_src = pi | ||
.cfg | ||
.explicitly_configure_original_source | ||
.unwrap_or(transparent); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should just panic
if the config flag is set and we can't do maybe_set_transparent
because of lack of perms, then we can just use transparent
in all cases?
I'm not sure we care about any other permutations.
I didn't check what we do with the returned error but this doesn't seem recoverable really. Perhaps we send some zds response that we can't configure or something along those lines... |
We were using the
cfg
in pool since #1040.cfg
is not whether its used, but rather the intent of the user -- we may enable or disable at runtime.This renames the field to make it clear it should not be used like this, and fixes the issue. This went through CI due to
#1084.