Skip to content

Allow only CA cert to be set in command server TLS settings #1079

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 4 commits into
base: v3-fixups
Choose a base branch
from

Conversation

UnwashedMeme
Copy link
Member

@UnwashedMeme UnwashedMeme commented May 14, 2025

Proposed changes

Describe the use case and detail of the change. If this PR addresses an issue on GitHub, make sure to include a link to that issue using one of the supported keywords here in this description (not in the title of the PR).

Checklist

Before creating a PR, run through this checklist and mark each as complete.

  • I have read the CONTRIBUTING document
  • I have run make install-tools and have attached any dependency changes to this pull request
    • this doesn't build on a clean checkout of v3 (more below)
  • If applicable, I have added tests that prove my fix is effective or that my feature works
  • If applicable, I have checked that any relevant tests pass after adding my changes
  • If applicable, I have updated any relevant documentation (README.md)
  • If applicable, I have tested my cross-platform changes on Ubuntu 22, Redhat 8, SUSE 15 and FreeBSD 13
    • I don't think this is applicable.

@UnwashedMeme UnwashedMeme requested a review from a team as a code owner May 14, 2025 22:01
@github-actions github-actions bot added the chore Pull requests for routine tasks label May 14, 2025
Copy link
Contributor

github-actions bot commented May 14, 2025

✅ All required contributors have signed the F5 CLA for this PR. Thank you!
Posted by the CLA Assistant Lite bot.

Comment on lines 446 to 454
"Test 3: incorrect CA should not error": { // REALLY ?!
conf: &config.TLSConfig{
Ca: "customca.pem",
},
wantErr: false,
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This feels wrong to me; the user specified a CA cert and we silently ignore it?!

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah I agree. Not sure why we aren't returning an error if the CA is invalid. I'd say update the logic to return an error instead of just logging a debug message.

Copy link
Member Author

@UnwashedMeme UnwashedMeme May 21, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My thinking is that this should error while building the connection credentials; similar to passing an invalid client cert. I was trying not to disrupt existing behavior too much in case there is a use case I'm missing.

Edit: this comment was made before I saw yours, the page hadn't refreshed

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I updated the logic; @dhurley please recheck. Here's the compare of what's changed since you last looked.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't completely solve it, but makes the error notably more visible:

time=2025-05-21T15:41:33.547Z level=ERROR msg="Unable to add transport credentials to gRPC dial options, adding default transport credentials" error="invalid CA cert while building transport credentials: read CA file (/etc/nginx-agent/bad.crt): open /etc/nginx-agent/bad.crt: no such file or directory"

I'm willing to call that good enough -- it is now visible what's gone wrong even if it didn't crash the process.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @UnwashedMeme. Yeah that looks good to me.

When generating a self-signed cert for _unit-tests_ it doesn't need to
have high security. Using a smaller length makes the tests run faster.

The difference is ~0.1s vs ~2.0s to run one test that generates a cert.
This was incorrectly passing in the cert and key backwards which
resulted in not actually getting mTLS credentials, but instead
insecure credentials.

When insecure credentials are used, skipToken means we don't add a
addPerRPCCredentials. When it is a secure TLS credential then
addPerRPCCredentials increases the dialoption count by one.
@UnwashedMeme UnwashedMeme added the bug Something isn't working label May 15, 2025
@UnwashedMeme UnwashedMeme changed the base branch from v3 to v3-fixups May 15, 2025 19:14
@UnwashedMeme
Copy link
Member Author

I have hereby read the F5 CLA and agree to its terms.

My employer F5 has ownership of the code I am contributing to F5. ;-)

@UnwashedMeme
Copy link
Member Author

I have hereby read the F5 CLA and agree to its terms

@UnwashedMeme
Copy link
Member Author

recheck

@UnwashedMeme
Copy link
Member Author

I haven't actually been able to run this yet to verify that after this change the CA is actually respected.

Comment on lines 446 to 454
"Test 3: incorrect CA should not error": { // REALLY ?!
conf: &config.TLSConfig{
Ca: "customca.pem",
},
wantErr: false,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah I agree. Not sure why we aren't returning an error if the CA is invalid. I'd say update the logic to return an error instead of just logging a debug message.

@dhurley dhurley changed the title V3 respect ca Allow only CA cert to be set in command server TLS settings May 21, 2025
@dhurley dhurley added the v3.x Issues and Pull Requests related to the major version v3 label May 21, 2025
This had been skipping out of the function early if a client key
wasn't specified.

I don't believe that's correct. If I[User] have specified specified a
CA cert because the MPI server I'm trying to talk to is signed by a
non-standard CA (e.g. N1 devenv) then it should be respected
regardless of whether I've configured mTLS.

Silently skipping the CA is really confusing and leads to

> Failed to create connection" error="rpc error: code = Unavailable desc = connection error: desc = \"transport: authentication handshake failed: tls: failed to verify certificate: x509: certificate signed by unknown authority\"

I've split out the getTLSConfigForCredentials to make it easier to
test this translation. Once it is wrapped into a TransportCredential
or a DialOption it's opaque and hard to verify.
Previously if a consumer specified the CA cert to verify the command
connection but that CA wasn't valid then system would log at
Debug (default hidden) and proceed anyways.

I don't believe this is good behavior. If the consumer is directly
specifying a CA cert then that is the CA that should be used, not
silently ignored.

This patch returns the error up, which is now caught and swallowed at
a higher level, but at least it is more  visible:

> time=2025-05-21T15:41:33.547Z level=ERROR msg="Unable to add transport credentials to gRPC dial options, adding default transport credentials" error="invalid CA cert while building transport credentials: read CA file (/etc/nginx-agent/bad.crt): open /etc/nginx-agent/bad.crt: no such file or directory"
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working chore Pull requests for routine tasks v3.x Issues and Pull Requests related to the major version v3
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants