add rollout support to weaver kube #32

Merged: rgrandl merged 3 commits into main on Jul 24, 2023

Conversation

@rgrandl (Collaborator) commented on Jul 20, 2023:

In the current implementation of weaver kube, you have to pause the world when you want to roll out a new version of the app.

In this PR we enable rolling upgrades, a deployment strategy that is supported by Kubernetes.

In the near future, we might want to provide integration with Argo Rollouts (or something else) to enable blue/green and canary deployments.

Main changes:

  • for each component that exports a listener, we create a deployment with a unique name; this name doesn't change across rollouts, so a rolling update replaces the existing pods one by one (see the sketch after this list)
  • once a pod is updated to the new version, whenever it gets a request it will only talk to other services within the same version
  • for each public listener, the name should be unique and not deployment specific
  • also, for Jaeger and Prometheus, we want single instances that persist across multiple versions (and capture the metrics/traces for all the deployed versions of an app)
  • Prometheus used to scrape an address that was passed to the pods to export metrics on; this doesn't really work across versions, so we change the Prometheus config to scrape all pods based on a label; the label is derived from the app name, so Prometheus scrapes all pods across all versions of the app
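
To make the naming and labeling concrete, here is a minimal sketch. It reuses the name, replicaSetInfo, and protos.Deployment types already defined in internal/impl/kube.go, but hasListeners is a hypothetical helper and this is not the PR's exact code:

func deploymentName(dep *protos.Deployment, rs *replicaSetInfo) string {
    if hasListeners(rs) {
        // The replica set hosts a listener: use a name that is stable across
        // app versions, so a rollout updates the existing deployment's pods
        // in place, one by one.
        return name{dep.App.Name, rs.name}.DNSLabel()
    }
    // No listeners: include a version suffix, so each app version gets its
    // own deployment.
    return name{dep.App.Name, rs.name, dep.Id[:8]}.DNSLabel()
}

// podLabels returns labels attached to every pod. The generated Prometheus
// config selects scrape targets by the "metrics" label (set to the app name),
// so it keeps scraping pods across all versions of the app.
func podLabels(dep *protos.Deployment) map[string]string {
    return map[string]string{"metrics": dep.App.Name}
}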

Drawbacks of this approach (things that might not work):

  • what if a listener moves between components
  • what if a listener is dropped

For now, all our examples have a single listener exported by main. So we could restrict apps to a single listener exported by main, which might cover most of the scenarios. However, how to handle listeners across versions is something to think about.

@spetrovic77 (Contributor) left a comment:

Nice set of changes!

internal/impl/kube.go (resolved)
@@ -716,6 +740,10 @@ func buildContainer(dockerImage string, rs *replicaSetInfo, dep *protos.Deployme
// should be only for a short period of time.
},

// Expose the metrics port from the container, so it can be discoverable for
Contributor:

Just FYI, I believe this is redundant, but a good practice anyway. You can open any local port and have it be accessible to the Service, Prometheus, etc., without declaring it here.

Collaborator (Author):

It doesn't work if I remove it ...
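
For reference, a minimal sketch of declaring the metrics port with the standard k8s.io/api/core/v1 types; the "prometheus" port name and the prometheusPort parameter are illustrative placeholders, not necessarily what the PR uses:

import corev1 "k8s.io/api/core/v1"

// metricsContainer builds a container spec that declares the port the weavelet
// serves metrics on, so the port is discoverable by name from Services and
// from monitoring configuration.
func metricsContainer(dockerImage string, prometheusPort int32) corev1.Container {
    return corev1.Container{
        Name:  "serviceweaver",
        Image: dockerImage,
        Ports: []corev1.ContainerPort{{
            Name:          "prometheus",
            ContainerPort: prometheusPort,
        }},
    }
}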

aname := name{dep.App.Name, "hpa", rs.name, dep.Id[:8]}.DNSLabel()
depName := name{dep.App.Name, rs.name, dep.Id[:8]}.DNSLabel()

var depName string
Contributor:

Can you move this code into a function, e.g.:

func deploymentName(rs *replicaSetInfo) string {...}

Collaborator (Author):

Done

var serviceType string
labels := map[string]string{}
if lis.IsPublic {
Contributor:

I believe that if the listener is non-public, we should do the same thing, i.e., use a non-versioned name.

The way you have it in buildAutoscaler should work here I believe.

Collaborator (Author):

Great point. Done
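
A sketch of the idea agreed on here, with hypothetical identifiers (listenerServiceName, lisName): the Service in front of a listener is named from the app and the listener only, with no deployment id, whether or not the listener is public, so the name survives rollouts:

func listenerServiceName(dep *protos.Deployment, lisName string) string {
    // No dep.Id component: the Service name stays stable across app versions.
    return name{dep.App.Name, lisName}.DNSLabel()
}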

},
// Number of old ReplicaSets to retain to allow rollback.
RevisionHistoryLimit: ptrOf(int32(1)),
MinReadySeconds: int32(5),
},
}, nil
}

// buildService generates a kubernetes service for a replica set.
func buildService(rs *replicaSetInfo, dep *protos.Deployment) (*corev1.Service, error) {
Contributor:

Maybe rename this to buildInternalService, e.g.:

// buildInternalService builds a Kubernetes service that is used for
// internal communication between the weavelets.

Collaborator (Author):

Done

@@ -514,7 +536,24 @@ scrape_configs:

// buildDeployment generates a kubernetes deployment for a replica set.
func buildDeployment(rs *replicaSetInfo, dep *protos.Deployment, image string) (*v1.Deployment, error) {
name := name{dep.App.Name, rs.name, dep.Id[:8]}.DNSLabel()
// Unique name that persists across app versions.
Contributor:

This is too confusing because we don't need a globalName if the deployment doesn't host any listeners. And if it does host listeners, the depName is incorrect below. (Hence the use of oname.)

Can you do it like so, with comments:

matchLabels := map[string]string{}
podLabels := map[string]string{"metrics": dep.App.Name}
var depName string
if hasListenersFn(rs) {
    depName = name{dep.App.Name, rs.name}.DNSLabel()
    // set labels
} else {
    depName = name{dep.App.Name, rs.name, dep.Id[:8]}.DNSLabel()
    // set labels
}

// use depName below instead of oname

Collaborator (Author):

I made some changes and I think they address your comment as well.

podLabels := map[string]string{"depName": depName, "metrics": dep.App.Name}
if hasListenersFn(rs) {
    oname = globalName
    matchLabels["globalName"] = globalName
Contributor:

I'm a bit confused about how an internal Service will talk to this Deployment, since we don't set the "depName" for it (and the internal Service matches based on "depName").

Could you try to run an example where "main" and some other component Foo are co-located, and a third component talks to the Foo component? I'm not sure if we have any such example.

Collaborator (Author):

We need to add labels for deployment and global to the pod labels. We already do that, so I think it works? I added a TODO to test with main though, because we don't have an app that does that.
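
For illustration, a sketch of the label/selector relationship discussed in this thread, using the standard corev1 and metav1 types; the identifiers follow the fragments quoted above rather than the final merged code. A pod that hosts listeners carries both the per-version "depName" label and the version-independent "globalName" label, so the internal Service (selecting on "depName") and the persistent listener Service (selecting on "globalName") both match it:

import (
    corev1 "k8s.io/api/core/v1"
    metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// internalService builds the per-version Service used for weavelet-to-weavelet
// communication; it routes to pods via the "depName" label, which is set on
// every pod, including pods that also carry a "globalName" label.
func internalService(depName string, port int32) *corev1.Service {
    return &corev1.Service{
        ObjectMeta: metav1.ObjectMeta{Name: depName},
        Spec: corev1.ServiceSpec{
            Selector: map[string]string{"depName": depName},
            Ports:    []corev1.ServicePort{{Port: port, Protocol: corev1.ProtocolTCP}},
        },
    }
}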

@rgrandl merged commit ad30589 into main on Jul 24, 2023 (8 of 9 checks passed).
@rgrandl deleted the add_argo_rollouts branch on Jul 24, 2023 at 17:24.