No retry mechanism for the cert fetch #1131

Open
1 task done
duffney opened this issue Oct 17, 2023 · 5 comments · May be fixed by #1625

Comments

@duffney
Contributor

duffney commented Oct 17, 2023

What happened in your environment?

I installed Ratify on my AKS cluster before the appropriate access policy was assigned to the managed identity used by Ratify to connect to an Azure Key Vault instance. Once I noticed the issue, I assigned the appropriate access policy to the identity, but without a retry for the certificate fetch I was forced to uninstall and reinstall Ratify on the cluster. @akashsinghal mentioned it's also possible to delete the certstore-akv certificatestore to force a new fetch to occur.

What did you expect to happen?

I expected that Ratify would retry the fetch and resolve the cert issues.
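
For illustration only, the behavior expected here (retry the fetch until the access policy is in place) can be sketched as a generic retry loop with exponential backoff. This is not Ratify's implementation; `retry_with_backoff` and its parameters are hypothetical names chosen for this sketch.

```shell
#!/usr/bin/env bash
# retry_with_backoff MAX_ATTEMPTS INITIAL_DELAY_SECONDS COMMAND [ARGS...]
# Hypothetical helper: runs COMMAND until it succeeds or MAX_ATTEMPTS is
# reached, doubling the wait between attempts.
retry_with_backoff() {
  local max_attempts="$1" delay="$2"
  shift 2
  local attempt=1
  while true; do
    # Stop as soon as the command succeeds.
    if "$@"; then
      return 0
    fi
    # Give up once the attempt budget is used.
    if [ "$attempt" -ge "$max_attempts" ]; then
      echo "giving up after ${attempt} attempts: $*" >&2
      return 1
    fi
    echo "attempt ${attempt} failed; retrying in ${delay}s" >&2
    sleep "$delay"
    delay=$((delay * 2))
    attempt=$((attempt + 1))
  done
}
```

A call such as `retry_with_backoff 5 2 some_fetch_command` (where `some_fetch_command` is a placeholder) would wait 2, 4, 8, and 16 seconds between its five attempts, which leaves time for an access policy assigned after installation to take effect.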

What version of Kubernetes are you running?

1.26.3

What version of Ratify are you running?

v1.0.0

Anything else you would like to add?

No response

Are you willing to submit PRs to contribute to this bug fix?

  • Yes, I am willing to implement it.

Steps to reproduce the error

  1. Clone https://github.com/duffney/secure-supply-chain-on-aks
  2. Run cd terraform, then terraform init && terraform apply --auto-approve
  3. Set environment vars from terraform output
     terraform init && terraform apply --auto-approve;
     export GROUP_NAME="$(terraform output -raw rg_name)"
     export AKS_NAME="$(terraform output -raw aks_name)"
     export VAULT_URI="$(terraform output -raw akv_uri)"
     export KEYVAULT_NAME="$(terraform output -raw akv_name)"
     export ACR_NAME="$(terraform output -raw acr_name)"
     export CERT_NAME="$(terraform output -raw cert_name)"
     export TENANT_ID="$(terraform output -raw tenant_id)"
     export CLIENT_ID="$(terraform output -raw wl_client_id)"
  4. Get AKS creds
     az aks get-credentials --resource-group ${GROUP_NAME} --name ${AKS_NAME}
  5. Deploy Gatekeeper
     helm repo add gatekeeper https://open-policy-agent.github.io/gatekeeper/charts

     helm install gatekeeper/gatekeeper  \
     --name-template=gatekeeper \
     --namespace gatekeeper-system --create-namespace \
     --set enableExternalData=true \
     --set validatingWebhookTimeoutSeconds=5 \
     --set mutatingWebhookTimeoutSeconds=2
  6. Deploy Ratify
     helm repo add ratify https://deislabs.github.io/ratify

     helm install ratify \
         ratify/ratify --atomic \
         --namespace gatekeeper-system \
         --set akvCertConfig.enabled=true \
         --set featureFlags.RATIFY_CERT_ROTATION=true \
         --set akvCertConfig.vaultURI=${VAULT_URI} \
         --set akvCertConfig.cert1Name=${CERT_NAME} \
         --set akvCertConfig.tenantId=${TENANT_ID} \
         --set oras.authProviders.azureWorkloadIdentityEnabled=true \
         --set azureWorkloadIdentity.clientId=${CLIENT_ID}
  7. Deploy template and constraint
     kubectl apply -f manifests/template.yaml
     kubectl apply -f manifests/constraint.yaml
  8. Deploy the manifests
     kubectl apply -f manifests/
  9. Describe certificatestore
     kubectl describe certificatestore certstore-akv --namespace gatekeeper-system
duffney added the bug (Something isn't working) and triage (Needs investigation) labels Oct 17, 2023
@akashsinghal
Collaborator

cc: @susanshi

@susanshi
Collaborator

susanshi commented Oct 24, 2023

Hi @duffney (Josh), the certificate store reconcile is only triggered when the CR is modified. Because a permission change happens outside Ratify, the customer needs to trigger a fetch manually by deleting the CR and applying it again. I will add a TSG (troubleshooting guide) for this workaround. Thank you for submitting this feedback.
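
As a rough sketch of that manual workaround (delete the CR, then apply it again), assuming a local copy of the CertificateStore manifest is available: the function name `refetch_certstore`, its arguments, and the manifest path are hypothetical; only the `kubectl` subcommands themselves are standard.

```shell
#!/usr/bin/env bash
# Sketch of the manual workaround: delete the CertificateStore CR and apply it
# again so that Ratify reconciles it and attempts the certificate fetch again.
# ASSUMPTION: manifest_path points to a local copy of the CR manifest; the
# function name and its arguments are hypothetical, not part of Ratify.
refetch_certstore() {
  if [ "$#" -ne 3 ]; then
    echo "usage: refetch_certstore NAME NAMESPACE MANIFEST_PATH" >&2
    return 2
  fi
  local name="$1" namespace="$2" manifest_path="$3"

  # Deleting and re-creating the CR is what triggers a new reconcile.
  kubectl delete certificatestore "$name" --namespace "$namespace" --ignore-not-found || return 1
  kubectl apply -f "$manifest_path" --namespace "$namespace" || return 1

  # Show the status reported after the new fetch attempt.
  kubectl describe certificatestore "$name" --namespace "$namespace"
}
```

For the store in this issue, the call would look like `refetch_certstore certstore-akv gatekeeper-system ./certstore-akv.yaml`, where the file name is a placeholder for wherever the manifest is kept.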

Another option is to implement a scheduled sync. We would need to discuss whether this setting should be configurable, and whether the certificate should be evicted if the fetch operation fails.

yizha1 removed the triage (Needs investigation) label Oct 24, 2023
yizha1 added this to the Future milestone Oct 24, 2023
@susanshi
Collaborator

TSG PR submitted for review: ratify-project/ratify-web#28

@yizha1
Collaborator

yizha1 commented Mar 18, 2024

@susanshi @akashsinghal I would like to discuss this issue in the community meeting this week.

yizha1 added the enhancement (New feature or request) label and removed the bug (Something isn't working) label Apr 30, 2024
akashsinghal modified the milestones: Future, v1.3.0 May 16, 2024
@susanshi
Collaborator

susanshi commented May 23, 2024

Hi @duffney, the Secrets Store CSI Driver supports auto rotation; we could see if there is anything we could reference from its design.

duffney linked a pull request Jul 10, 2024 that will close this issue
19 tasks