Integrate with existing observability systems #44

rgrandl · 2023-08-09T21:57:41Z

Before this PR, the Kube deployer automatically started services to provide observability to the user: Prometheus, Jaeger, Loki, Promtail, and Grafana.

While this is great for a simple deployment, in many production scenarios the user doesn't want to start some of these services (or wants to disable everything when testing or benchmarking performance). The user may already run their own custom services that scale well and have complicated configs.

This PR enables the user to plug in custom observability services. Note, however, that we are opinionated about which services they can use: Prometheus for metrics, Jaeger for traces, Loki/Promtail for logs, and Grafana for visualization. We don't enable integration with other services yet.

spetrovic77

This looks great, Robert. My comments are mainly about making the code more understandable, since it's getting quite complicated and subtle.

internal/impl/kube.go

internal/impl/observability.go

spetrovic77 · 2023-08-10T16:16:24Z

internal/impl/observability.go

+
+// shouldGenerateKubeDeploymentInfo returns true iff a Kubernetes deployment info
+// should be generated for service.
+func shouldGenerateKubeDeploymentInfo(service string, cfg *KubeConfig) bool {


shouldGenerateServiceConfigs

internal/impl/kube.go

spetrovic77 · 2023-08-10T16:24:58Z

internal/impl/observability.go

+// [1] https://helm.sh/
+func generateInfoToExportTraces(dep *protos.Deployment, cfg *KubeConfig) ([]byte, error) {
+	// The user disabled exporting the traces, don't generate anything.
+	if !shouldGenerateKubeDeploymentInfo(exportTracesURL, cfg) {


This confused the life out of me, since exportTracesURL sounds like it represents an actual URL to the Jaeger endpoint. Instead, it's the key in the Observability map.

Can you rename:

`s/exportTracesURL/tracesConfigKey/`

Same for metrics etc.

Actually, even though it's somewhat prone to error, I would be happier to just inline "trace_service" here since it would be clearer.

spetrovic77 · 2023-08-10T16:28:40Z

internal/impl/observability.go

+// shouldGenerateKubeDeploymentInfo returns true iff a Kubernetes deployment info
+// should be generated for service.
+func shouldGenerateKubeDeploymentInfo(service string, cfg *KubeConfig) bool {
+	return cfg.Observability[service] == ""


This is confusing since you are checking for "" instead of "none". Could you perhaps just get rid of this method and do actual checks for "auto", "none", "", and "url" everywhere, with comments?

internal/impl/observability.go

rgrandl

Thanks Srdjan for your thorough review.

rgrandl · 2023-08-10T17:31:25Z

internal/impl/babysitter.go

+		traceExporter, err =
+			jaeger.New(jaeger.WithCollectorEndpoint(jaeger.WithEndpoint(cfg.TraceExporterService)))
+		if err != nil {
+			fmt.Fprintf(os.Stderr, "Unable to create trace exporter: %v\n", err)


rgrandl · 2023-08-10T17:41:40Z

internal/impl/kube.go

+	// Compute the URL of the export traces service.
+	var exportTracesURLInfo string
+	val := cfg.Observability[exportTracesURL]
+	exportTracesURLIsSet := val != "" && val != "none"


Definitely more elegant. Thanks!

rgrandl · 2023-08-10T18:27:24Z

internal/impl/observability.go

+// implementations for any observability systems.
+
+const (
+	// The names of the observability services that interact with the application.


I added comments and renamed fields (e.g., jaegerImageName -> autoJagerImageName; jaegerCollectorPort -> defaultJaegerCollectorPort).

For 'exportTracesURL' et al., I am not sure what the right name is. The names 'jaeger_service', etc. also bother me, but I couldn't come up with anything better. I am also not sure whether enabling auto-generated observability by default is the most intuitive choice, or whether we should generate nothing and have the user explicitly set these variables to "auto" or to their external service name instead. Maybe for now we leave it as it is and rethink it later, when people actually try to use this.

rgrandl · 2023-08-10T19:02:48Z

internal/impl/observability.go

+
+// generateObservabilityInfo generates deployment information needed by the app
+// to export metrics, logs, and traces.
+func generateObservabilityInfo(dep *protos.Deployment, cfg *KubeConfig) ([]byte, error) {


I call it info because it contains both deployment info (e.g., the Kubernetes deployment and service manifests) and config info (what is needed to configure the running service). I renamed everything to configs, for example:

generatePrometheusConfigs -> to configure Prometheus service
generatePrometheusDeploymentConfigs -> to generate kubernetes deployment configs needed to run the Prometheus service

internal/impl/observability.go

rgrandl · 2023-08-10T19:13:18Z

internal/impl/observability.go

+// observability = {jaeger_service = "jaeger-all-in-one"}
+//
+// [1] https://helm.sh/
+func generateInfoToExportTraces(dep *protos.Deployment, cfg *KubeConfig) ([]byte, error) {


rgrandl · 2023-08-10T19:22:08Z

internal/impl/observability.go

+// [1] https://helm.sh/
+func generateInfoToExportTraces(dep *protos.Deployment, cfg *KubeConfig) ([]byte, error) {
+	// The user disabled exporting the traces, don't generate anything.
+	if !shouldGenerateKubeDeploymentInfo(exportTracesURL, cfg) {


I am sorry about that. Yeah, the names are tricky and I've spent a lot of time renaming things over and over.

I would prefer to keep the names as tracesConfigKey, metricsConfigsKey, etc. instead of inlining them, if you don't mind: a small typo would make our lives miserable, because some of these names are propagated everywhere.

I think at some point we should create an observability subdirectory and put the different services in different files. It's getting hairy and hard to follow in general.
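
To illustrate the typo concern, here is a minimal sketch of what happens when a map key is inlined and mistyped. The key value "jaeger_service" comes from the doc comment in this PR; the `lookup` helper and the misspelled key are hypothetical and exist only for this example.

```go
package main

import "fmt"

// tracesConfigKey names the entry in the observability map that configures
// tracing. The value is taken from the PR's doc comment; treat it as an
// assumption about the final code.
const tracesConfigKey = "jaeger_service"

// lookup is a hypothetical helper that returns the configured value for key
// and whether the key was present in the map.
func lookup(observability map[string]string, key string) (string, bool) {
	v, ok := observability[key]
	return v, ok
}

func main() {
	observability := map[string]string{tracesConfigKey: "jaeger-all-in-one"}

	// Using the named constant: a misspelling here would not compile.
	v, ok := lookup(observability, tracesConfigKey)
	fmt.Printf("%q %v\n", v, ok) // prints "jaeger-all-in-one" true

	// Using a mistyped inline literal: this compiles and silently yields "".
	v, ok = lookup(observability, "jeager_service")
	fmt.Printf("%q %v\n", v, ok) // prints "" false
}
```

Because an empty value is also what an unset entry looks like, a mistyped literal would be indistinguishable from the user not configuring the service at all, which is why a shared constant is the safer choice.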

rgrandl · 2023-08-10T19:27:25Z

internal/impl/observability.go

+
+// shouldGenerateKubeDeploymentInfo returns true iff a Kubernetes deployment info
+// should be generated for service.
+func shouldGenerateKubeDeploymentInfo(service string, cfg *KubeConfig) bool {


rgrandl · 2023-08-10T19:37:23Z

internal/impl/observability.go

+// shouldGenerateKubeDeploymentInfo returns true iff a Kubernetes deployment info
+// should be generated for service.
+func shouldGenerateKubeDeploymentInfo(service string, cfg *KubeConfig) bool {
+	return cfg.Observability[service] == ""


I added a constant auto = "" for checks where kube should generate Kubernetes service configs, and a constant disabled = "none" for checks where nothing (neither configs nor service configs) should be generated for a given observability service. I hope this is better.
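
For readers following the thread, the three cases discussed here can be sketched roughly as follows. The constants `auto` and `disabled` and their values are the ones described in this comment; the `classify` function and its return strings are hypothetical and are not the code in the PR.

```go
package main

import "fmt"

const (
	auto     = ""     // kube generates and starts the service itself
	disabled = "none" // nothing is generated for the service
)

// classify is a hypothetical helper that maps a value from the observability
// map to one of the three behaviors discussed in the review.
func classify(val string) string {
	switch val {
	case auto:
		return "auto"
	case disabled:
		return "disabled"
	default:
		// Any other value is treated as the name of a user-provided service.
		return "external"
	}
}

func main() {
	for _, val := range []string{"", "none", "jaeger-all-in-one"} {
		fmt.Printf("%q -> %s\n", val, classify(val))
	}
	// prints:
	// "" -> auto
	// "none" -> disabled
	// "jaeger-all-in-one" -> external
}
```

Spelling out all three branches in one place keeps the `""` versus `"none"` distinction visible at each call site instead of hiding it behind a boolean.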

spetrovic77

Looks great!

spetrovic77 · 2023-08-10T20:23:47Z

internal/impl/kube.go

+		ReplicaSet:        r.name,
+		ComponentsToStart: r.components,
+		InternalPort:      int32(r.internalPort),
+		TraceServiceUrl:   r.traceServiceURL,


nit: s/Url/URL/

TraceServiceUrl is the autogenerated proto field trace_service_url. Unless we define it as trace_serviceURL, I think protoc will convert URL to Url.
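
As a rough illustration of the naming rule mentioned above, the sketch below approximates how a snake_case proto field name becomes a Go field name. It is a simplified stand-in written for this example, not the actual protoc-gen-go implementation, and `goFieldName` is a hypothetical name.

```go
package main

import (
	"fmt"
	"strings"
)

// goFieldName approximates the generator's behavior: drop each underscore and
// upper-case the lowercase letter that follows it (and the first letter).
// All other characters are copied unchanged, so initialisms such as "url"
// are not given any special treatment.
func goFieldName(protoName string) string {
	var b strings.Builder
	upperNext := true
	for i := 0; i < len(protoName); i++ {
		c := protoName[i]
		if c == '_' {
			upperNext = true
			continue
		}
		if upperNext && 'a' <= c && c <= 'z' {
			c -= 'a' - 'A'
		}
		upperNext = false
		b.WriteByte(c)
	}
	return b.String()
}

func main() {
	fmt.Println(goFieldName("trace_service_url")) // prints TraceServiceUrl
	fmt.Println(goFieldName("trace_serviceURL"))  // prints TraceServiceURL
}
```

Under this rule, the only way to get `URL` in the generated name is to spell it in upper case in the .proto file, which would go against the usual snake_case style for proto fields.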

rgrandl

Thanks Srdjan!

rgrandl force-pushed the try_out branch 2 times, most recently from f95d786 to e8e7b83 Compare August 9, 2023 22:08

rgrandl force-pushed the try_out branch from e8e7b83 to 0765292 Compare August 9, 2023 22:14

rgrandl requested a review from spetrovic77 August 9, 2023 22:19

spetrovic77 reviewed Aug 10, 2023

Addressed Srdjan's comments

8739d79

rgrandl commented Aug 10, 2023

spetrovic77 approved these changes Aug 10, 2023

rgrandl commented Aug 10, 2023

rgrandl merged commit a9a65f3 into main Aug 10, 2023
8 checks passed

rgrandl deleted the try_out branch August 10, 2023 20:37

rgrandl mentioned this pull request Aug 10, 2023

Gate Prometheus and Jaegar behind flags. #25

Closed
