Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Updating Ports with correctDrift enabled using multiple-paths repo triggers an error #2609

Open
mmartin24 opened this issue Jul 5, 2024 · 6 comments

Comments

@mmartin24
Copy link
Collaborator

mmartin24 commented Jul 5, 2024

Issue

Error triggered after updating ports with correctDrift enabled while using multiple-paths repo

Reproduction steps

  • Install Rancher 2.8-head with 3 downstream clusters
  • Created a GitRepo by enabling correctDrift. (https://github.com/rancher/fleet-test-data, path: multiple-paths) and deployed to all downstream clusers
  • Navigated to DS cluster --> Services
  • Edited service by updating it's port.
  • Waited for 30 seconds.
  • Observations on Rancher 2.8.5:
  • GitRepo is in Modified state with Error(See screenshot.)
    Image
  • Navigated to Continuous Delivery --> Clusters.
  • DS cluster on which service was deployed/created is in Modified state.
  • Above observation is seen in Rancher 2.8.5 + 2.8-head but not in 2.9.0-alpha7.
    Edit: Observed in 2.9 as well IF Force update is applied

This check has been currently added to ui/e2e ci here. A video can be downloaded from artifact and can by playing the part p1.specs.ts on minute 8:02

@mmartin24
Copy link
Collaborator Author

Not sure if the Modified state is somehow related to rancher/dashboard#11404

@mmartin24
Copy link
Collaborator Author

FTR: After adding latest checks for this issue on our ci, 2.8-head did not show this error today, but 2.7-head did.

@kkaempf kkaempf added this to the v2.9.0 milestone Jul 8, 2024
@mmartin24 mmartin24 changed the title Updating Ports with correctDritft enabled using multiple-paths repo triggers an error Updating Ports with correctDrift enabled using multiple-paths repo triggers an error Jul 8, 2024
@manno manno modified the milestones: v2.9.0, v2.7-Next1 Jul 9, 2024
@mmartin24
Copy link
Collaborator Author

I just performed a manual check and it seems there was a caveat in our automation.
We removed the click on force update from our tests. For some reason in 2.7 this was still ok and spotted the issue, but in 2.9 it would remain unnoticed until clicked. After doing this the issue appeared:

Image

I will keep investigating tomorrow morning

@mmartin24 mmartin24 modified the milestones: v2.7-Next1, v2.9-Next1 Jul 10, 2024
@manno
Copy link
Member

manno commented Jul 10, 2024

Seems related to how helm upgrades services for us. See https://github.com/kubernetes/kubernetes/issues/105610

fleet-agent should store the error in the bundledeployments status

@mmartin24
Copy link
Collaborator Author

Adding error logs

{
  "level": "error",
  "ts": "2024-07-10T09:02:36Z",
  "logger": "bundledeployment",
  "msg": "Failed to deploy bundle",
  "controller": "bundledeployment",
  "controllerGroup": "fleet.cattle.io",
  "controllerKind": "BundleDeployment",
  "BundleDeployment": {
    "name": "test-drift-multiple-paths-service",
    "namespace": "cluster-fleet-default-imported-2-945038eba7ea"
  },
  "namespace": "cluster-fleet-default-imported-2-945038eba7ea",
  "name": "test-drift-multiple-paths-service",
  "reconcileID": "308f9b4f-fb0f-48ca-a273-2e475b183966",
  "status": {
    "conditions": [
      {
        "type": "Installed",
        "status": "True",
        "lastUpdateTime": "2024-07-10T09:01:20Z"
      },
      {
        "type": "Deployed",
        "status": "False",
        "lastUpdateTime": "2024-07-10T09:02:36Z",
        "reason": "Error",
        "message": "cannot patch \"mp-app-service\" with kind Service: Service \"mp-app-service\" is invalid: spec.ports[1].name: Duplicate value: \"required-name2\""
      },
      {
        "type": "Ready",
        "status": "True",
        "lastUpdateTime": "2024-07-10T09:02:36Z"
      },
      {
        "type": "Monitored",
        "status": "True",
        "lastUpdateTime": "2024-07-10T09:01:20Z"
      }
    ],
    "appliedDeploymentID": "s-e900fb60b86d8593e95a733a0c0d1794f2d71a00910f794d19bcd4d57deca:aa73273923fd2b194b95dc51be330a7b1be92dafa689e0afb400abda8b37d8c0",
    "release": "test-fleet-mp-service/test-drift-multiple-paths-service:1",
    "ready": true,
    "nonModified": true,
    "display": {
      "deployed": "Error: cannot patch \"mp-app-service\" with kind Service: Service \"mp-app-service\" is invalid: spec.ports[1].name: Duplicate value: \"required-name2\"",
      "monitored": "True",
      "state": "Ready"
    },
    "syncGeneration": 0
  },
  "error": "cannot patch \"mp-app-service\" with kind Service: Service \"mp-app-service\" is invalid: spec.ports[1].name: Duplicate value: \"required-name2\"",
  "errorVerbose": "cannot patch \"mp-app-service\" with kind Service: Service \"mp-app-service\" is invalid: spec.ports[1].name: Duplicate value: \"required-name2\"\nhelm.sh/helm/v3/pkg/kube.(*Client).Update\n\t/home/runner/go/pkg/mod/github.com/rancher/helm/[email protected]/pkg/kube/client.go:438\nhelm.sh/helm/v3/pkg/action.(*Install).performInstall\n\t/home/runner/go/pkg/mod/github.com/rancher/helm/[email protected]/pkg/action/install.go:456\nhelm.sh/helm/v3/pkg/action.(*Install).performInstallCtx.func1\n\t/home/runner/go/pkg/mod/github.com/rancher/helm/[email protected]/pkg/action/install.go:421\nruntime.goexit\n\t/home/runner/go/pkg/mod/golang.org/[email protected]/src/runtime/asm_amd64.s:1695",
  "stacktrace": "github.com/rancher/fleet/internal/cmd/agent/controller.(*BundleDeploymentReconciler).Reconcile\n\t/home/runner/work/fleet/fleet/internal/cmd/agent/controller/bundledeployment_controller.go:129\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile\n\t/home/runner/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:114\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/home/runner/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:311\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/home/runner/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:261\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\t/home/runner/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:222"
}
{
  "level": "error",
  "ts": "2024-07-10T09:02:36Z",
  "msg": "Reconciler error",
  "controller": "bundledeployment",
  "controllerGroup": "fleet.cattle.io",
  "controllerKind": "BundleDeployment",
  "BundleDeployment": {
    "name": "test-drift-multiple-paths-service",
    "namespace": "cluster-fleet-default-imported-2-945038eba7ea"
  },
  "namespace": "cluster-fleet-default-imported-2-945038eba7ea",
  "name": "test-drift-multiple-paths-service",
  "reconcileID": "308f9b4f-fb0f-48ca-a273-2e475b183966",
  "error": "failed deploying bundle: cannot patch \"mp-app-service\" with kind Service: Service \"mp-app-service\" is invalid: spec.ports[1].name: Duplicate value: \"required-name2\"",
  "errorCauses": [
    {
      "error": "failed deploying bundle: cannot patch \"mp-app-service\" with kind Service: Service \"mp-app-service\" is invalid: spec.ports[1].name: Duplicate value: \"required-name2\""
    }
  ],
  "stacktrace": "sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/home/runner/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:324\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/home/runner/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:261\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\t/home/runner/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:222"
}

@weyfonk
Copy link
Contributor

weyfonk commented Sep 11, 2024

Additional QA

Problem

When failing to correct drift on a resource (eg. modified ports array on a service), Fleet would leave a GitRepo in Modified state, with no error on the corresponding bundle deployment status.

Solution

  • The drift correction error is now reflected in the bundle deployment status, which should in turn propagate it to the GitRepo status
  • Setting force: true on the GitRepo resolves the error by deleting and recreating the Helm release for the bundle deployment, hence recreating the service in this case.

Testing

See reproduction steps above, in the issue description.

Engineering Testing

Manual Testing

  1. Created a GitRepo with drift correction enabled (but not set to force mode) pointing to rancher/fleet-test-data's multiple-paths
  2. Edited the created service ports
  3. Checked status of the GitRepo and bundle deployment
  4. Updated the GitRepo drift correction mode to true
  5. Saw the GitRepo and bundle deployment status error disappear, once the service had been recreated.

Automated Testing

  • Integration tests cover propagation of the drift correction error to the bundle deployment status
  • End-to-end tests verify that when a bundle is marked as modified, patching a GitRepo to set its correctDrift.force option to true eventually updates the bundle status, in that the bundle will no longer appear as modified.

QA Testing Considerations

  • Test how the GitRepo status is reflected in the Rancher UI

Regressions Considerations

N/A

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Status: Needs QA review
Development

No branches or pull requests

4 participants