Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Create 2 services for brokers: a headless service with publishNotReadyAddresses and an ordinary clusterip service for lookups #437

Open
lhotari opened this issue Jan 19, 2024 · 2 comments

Comments

@lhotari
Copy link
Member

lhotari commented Jan 19, 2024

Is your feature request related to a problem? Please describe.

Pulsar brokers use Zookeeper for publishing the membership to other brokers.
The Pulsar load manager will detect a new broker and assign new namespace bundles to the new broker.
One of the gaps with Pulsar and k8s is that the broker registers itself to Zookeeper and the broker pod
cannot be accessed before the broker pod's readiness probe check has passed and k8s marks the pod ready
and makes it available in DNS. There's also additional delay from DNS caches. This delay will cause a short service disruption for all topics belonging to the namespace bundles that have been assigned to the new broker.

For the brokers, there's also other reasons: the headless service has "publishNotReadyAddresses: true" and other service (which would be used for lookups) doesn't, so that lookups go only to healthy brokers.

Headless broker services with a lot of brokers could cause issues with Pulsar clients that are running on Alpine <3.18. k8s docs have something about this too: https://kubernetes.io/docs/tasks/administer-cluster/dns-debugging-resolution/#known-issues .

"If you are using Alpine version 3.17 or earlier as your base image, DNS may not work properly due to a design issue with Alpine. Until musl version 1.24 didn't include TCP fallback to the DNS stub resolver meaning any DNS call above 512 bytes would fail. Please upgrade your images to Alpine version 3.18 or above."

Since headless services return all hosts in the DNS response, it could exceed 512 bytes.

Describe the solution you'd like

The way to mitigate this is to have a separate ClusterIP service to be used for lookups and have a headless service with
"publishNotReadyAddresses: true". This is a common way to address the problem. One example of this is the
Bitnami Zookeeper Helm chart, https://github.com/bitnami/charts/tree/main/bitnami/zookeeper/templates
there's svc.yaml and svc-headless.yaml

"StatefulSets currently require a Headless Service to be responsible for the network identity of the Pods."
https://kubernetes.io/docs/concepts/workloads/controllers/statefulset/#limitations
Since brokers use a statefulset, the headless service cannot be removed.

The DNS names of the 2 services will be different. Since clients don't need to know the direct DNS names of the brokers, it's better to change the current headless service to be the ClusterIP service and add a new headless service for handling the stable names of the brokers. Admin scripts that connect directly to individual brokers might break, but that's could be made clear in the release notes when this feature is released.

@lhotari
Copy link
Member Author

lhotari commented Sep 3, 2024

It's possible that making a change to use podManagementPolicy: OrderedReady for the broker statefulset could be a partial mitigation to this issue until this is resolved. #474 adds a solution where podManagementPolicy: OrderedReady is used when the function worker is enabled in the broker. it's not currently possible to configure to use this explicitly in the Helm chart without modifications.

@lhotari
Copy link
Member Author

lhotari commented Sep 3, 2024

the podManagementPolicy workaround is implemented in #525

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant