Skip to content

Conversation

@kangclzjc
Copy link
Contributor

@kangclzjc kangclzjc commented Aug 14, 2025

Currently we do the certificate generation ourselves. If they already have integrated cert-manager into their cluster, then Grove should give them an option to depend on that, instead

Now if I have cert-manager installed in my cluster, it works. I will add another PR to complete Makefile and related shell script. Currently, I can use helm install grove-operator . --set enableCertManager=true --set certManager.mode=internal to use cert-manager and manage cert/caBundle.

@kangclzjc kangclzjc marked this pull request as draft August 14, 2025 01:56
@kangclzjc kangclzjc self-assigned this Aug 14, 2025
@kangclzjc kangclzjc marked this pull request as ready for review August 15, 2025 07:03
@kangclzjc kangclzjc changed the title [WIP] Add cert-manager support as an option Add cert-manager support as an option Aug 15, 2025
@kangclzjc kangclzjc changed the title Add cert-manager support as an option feat: add cert-manager support as an option Aug 15, 2025
@kangclzjc kangclzjc force-pushed the certmanager branch 2 times, most recently from d126ec8 to ba972a2 Compare August 15, 2025 08:24
Signed-off-by: Kang Zhang <[email protected]>
Signed-off-by: Kang Zhang <[email protected]>
Signed-off-by: Kang Zhang <[email protected]>
Signed-off-by: kangclzjc <[email protected]>
Copy link
Contributor

@gflarity gflarity left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a good idea to add some flexibility around certificate generation, however I'd recommend we simplify this. Rather than have flags to enable certmanager, I'd recommend allowing the operator to provide a secret which contains a the certficate, the associated CA bundle. Basically the approach described her in the Elastic Cloud helm chart here.

Rationale:

it's even more flexible, perhaps someone wants to use a different cert manager that build themselves for... reasons
any operators who want to provide certmanager, they generally know what they're trying to accomplish and how. All they need is a way to point grove in the right direction
99.9% of deployments won't ever provide this and want to involved minimally in its internal operations, so no need to complicate everything for the small edge cases who do

@kangclzjc
Copy link
Contributor Author

It's a good idea to add some flexibility around certificate generation, however I'd recommend we simplify this. Rather than have flags to enable certmanager, I'd recommend allowing the operator to provide a secret which contains a the certficate, the associated CA bundle. Basically the approach described her in the Elastic Cloud helm chart here.

Rationale:

it's even more flexible, perhaps someone wants to use a different cert manager that build themselves for... reasons any operators who want to provide certmanager, they generally know what they're trying to accomplish and how. All they need is a way to point grove in the right direction 99.9% of deployments won't ever provide this and want to involved minimally in its internal operations, so no need to complicate everything for the small edge cases who do

It's a good idea, maybe it doesn't have conflict with current implementation. I just give a internal example here. For externel mode, we can allow the operator to provide a secret which contains a certficate. Because for some cases, customer may want to easily deploy it in develop mode for internal use/developement. In internal mode, we can hide the complexity of certificate generation for developer.

@gflarity
Copy link
Contributor

gflarity commented Sep 2, 2025

It's a good idea to add some flexibility around certificate generation, however I'd recommend we simplify this. Rather than have flags to enable certmanager, I'd recommend allowing the operator to provide a secret which contains a the certficate, the associated CA bundle. Basically the approach described her in the Elastic Cloud helm chart here.
Rationale:
it's even more flexible, perhaps someone wants to use a different cert manager that build themselves for... reasons any operators who want to provide certmanager, they generally know what they're trying to accomplish and how. All they need is a way to point grove in the right direction 99.9% of deployments won't ever provide this and want to involved minimally in its internal operations, so no need to complicate everything for the small edge cases who do

It's a good idea, maybe it doesn't have conflict with current implementation. I just give a internal example here. For externel mode, we can allow the operator to provide a secret which contains a certficate. Because for some cases, customer may want to easily deploy it in develop mode for internal use/developement. In internal mode, we can hide the complexity of certificate generation for developer.

I see this from two different angles and arrive at the same conclusion. I think we're already agreed on 1, but I think I can communicate 2 better.

  1. By just allowing a secret (location) and CA (value) to be specified, you cover the cert manager use case while offering even more flexibility with less complexity.

  2. Having used certmanager in production on a number of projects, I don't think certmanager is actually desirable for the webhook certificate management/rotation use case specifically. People will just use self signed certificates and it's still a bit of effort to set those up. The github.com/open-policy-agent/cert-controller/pkg/rotator solution have will meet 99% of users needs and seems robust to me. That said, I can imagine %1 edge cases/environments where it's desirable or even required to manage certicates externally. For those edge cases, the operarators (owners/users) want extreme control and might not even use certmanager. So the ability to just point to the secret location (which certmanager could own and control) is all that an operator wants anyways. I've used it this a way a bunch of times and it's perfect.

@unmarshall
Copy link
Collaborator

unmarshall commented Sep 3, 2025

@gflarity

The github.com/open-policy-agent/cert-controller/pkg/rotator solution have will meet 99% of users needs and seems robust to me.

Well, for short term we will be using the cert-controller but it has several issues which i shared on the slack. I will add those points here as well.

While we see dev activity on this project but these are minor fixes. Tickets that highlight a bigger gap are open for more than 2-3 years. This is not a very good sign for a project that we wish to have dependency on. Others like kubewarden also evaluated cert-controller and ended up having their own custom solution.

@unmarshall
Copy link
Collaborator

In my opinion there is no harm in adding support for Cert manager. The reference that you have given above (https://www.elastic.co/guide/en/cloud-on-k8s/2.12/k8s-webhook.html) provides 3 options as well (including support for cert-manager).

For production systems you should ideally have certificate rotation in place for all your webhooks. Not every consumer would like to re-invent the wheel to rotate the certificates without impacting availability (zero-down-time). So while that option can be provided to be complete not everyone would prefer that.

@gflarity
Copy link
Contributor

gflarity commented Sep 9, 2025

I definitely think cert-manager support is a good idea. But let's just keep it simple. Optional CA/Secet helm values should do the trick.

It's shame that cert-controller isn't seeing much investment... Looks like another approach is to use kube-webhook-certgen
in helmchart preinstall/postinstall scripts. Seems to work well for the Prometheus community.

@unmarshall
Copy link
Collaborator

Looks like another approach is to use kube-webhook-certgen
in helmchart preinstall/postinstall scripts. Seems to work well for the Prometheus community.

kube-webhook-certgen is fine for non-production environments as it generates certificates that last 100 years. That is not something that you would like to do in a productive environment. They also explicitly mention it on their README.md

This tool may not be adequate in all security environments. If a more complete solution is required, you may want to seek alternatives such as [jetstack/cert-manager](https://github.com/jetstack/cert-manager)

We will be investing into creating something that resolves all issues of cert-controller soon.

@kangclzjc
Copy link
Contributor Author

I definitely think cert-manager support is a good idea. But let's just keep it simple. Optional CA/Secet helm values should do the trick.

It's shame that cert-controller isn't seeing much investment... Looks like another approach is to use kube-webhook-certgen in helmchart preinstall/postinstall scripts. Seems to work well for the Prometheus community.

kube-webhook-certgen seems not be updated for 4 years. I guess maybe we can add cert-manager support first, at least give customer a basic way to manage certs. What do you think @gflarity ?

@gflarity
Copy link
Contributor

gflarity commented Oct 4, 2025

@kangclzjc yes, we should definitely support external certificate management. What did you think about the approach I mentioned? https://www.elastic.co/guide/en/cloud-on-k8s/2.12/k8s-webhook.html.

I'm happy to jump on a video chat this week to discuss moving forward. I'll ping you on slack.

@kangclzjc
Copy link
Contributor Author

@kangclzjc yes, we should definitely support external certificate management. What did you think about the approach I mentioned? https://www.elastic.co/guide/en/cloud-on-k8s/2.12/k8s-webhook.html.

I'm happy to jump on a video chat this week to discuss moving forward. I'll ping you on slack.

The approach you mentioned is great to me. And just as @unmarshall said, I can add k8s-webhook feature in other PR. It should be separated with this cert-manager PR.

@renormalize
Copy link
Collaborator

Totally agree with @gflarity's suggestion. We should simply allow for external management of these secrets. How the user achieves that, should not be our concern. The approach suggested in #145 (comment) looks good to me. We should emulate that.

@kangclzjc
Copy link
Contributor Author

kangclzjc commented Oct 24, 2025

Totally agree with @gflarity's suggestion. We should simply allow for external management of these secrets. How the user achieves that, should not be our concern. The approach suggested in #145 (comment) looks good to me. We should emulate that.

yeah, that's good. Based on this, I draft a PR, after PR's ready, please help to review this PR. In this PR, we can mount
webhook-cert-dir or leverage existing certificate which is provided by cert-manager.

@kangclzjc kangclzjc closed this Oct 24, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants