feat(distributor): add experimental memberlist kvStore for ha_tracker #10054

Merged
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -73,6 +73,7 @@
* [FEATURE] PromQL: Add experimental `info` function. Experimental functions are disabled by default, but can be enabled setting `-querier.promql-experimental-functions-enabled=true` in the query-frontend and querier. #9879
* [FEATURE] Distributor: Support promotion of OTel resource attributes to labels. #8271
* [FEATURE] Querier: Add experimental `double_exponential_smoothing` PromQL function. Experimental functions are disabled by default, but can be enabled by setting `-querier.promql-experimental-functions-enabled=true` in the query-frontend and querier. #9844
+* [FEATURE] Distributor: Add experimental `memberlist` KV store for ha_tracker. You can enable it using the `-distributor.ha-tracker.kvstore.store` flag. You can configure Memberlist parameters via the `-memberlist-*` flags. #10054
* [ENHANCEMENT] Query Frontend: Return server-side `bytes_processed` statistics following Server-Timing format. #9645 #9985
* [ENHANCEMENT] mimirtool: Adds bearer token support for mimirtool's analyze ruler/prometheus commands. #9587
* [ENHANCEMENT] Ruler: Support `exclude_alerts` parameter in `<prometheus-http-prefix>/api/v1/rules` endpoint. #9300
2 changes: 2 additions & 0 deletions docs/sources/mimir/configure/about-versioning.md
@@ -88,6 +88,8 @@ The following features are currently experimental:
- `-distributor.otel-created-timestamp-zero-ingestion-enabled`
- Promote a certain set of OTel resource attributes to labels
- `-distributor.promote-otel-resource-attributes`
+- Add experimental `memberlist` key-value store for ha_tracker. Note that this feature is `experimental`, as the upper limits of propagation times have not yet been validated. Additionally, cleanup operations have not yet been implemented for the memberlist entries.
+- `-distributor.ha-tracker.kvstore.store`
- Hash ring
- Disabling ring heartbeat timeouts
- `-distributor.ring.heartbeat-timeout=0`
@@ -798,9 +798,8 @@ ha_tracker:
# CLI flag: -distributor.ha-tracker.failover-timeout
[ha_tracker_failover_timeout: <duration> | default = 30s]

-# Backend storage to use for the ring. Please be aware that memberlist is not
-# supported by the HA tracker since gossip propagation is too slow for HA
-# purposes.
+# Backend storage to use for the ring. Note that memberlist support is
+# experimental.
kvstore:
# Backend storage to use for the ring. Supported values are: consul, etcd,
# inmemory, memberlist, multi.
7 changes: 1 addition & 6 deletions pkg/distributor/ha_tracker.go
@@ -34,7 +34,6 @@ import (
var (
errNegativeUpdateTimeoutJitterMax = errors.New("HA tracker max update timeout jitter shouldn't be negative")
errInvalidFailoverTimeout = "HA Tracker failover timeout (%v) must be at least 1s greater than update timeout - max jitter (%v)"
-errMemberlistUnsupported = errors.New("memberlist is not supported by the HA tracker since gossip propagation is too slow for HA purposes")
)

type haTrackerLimits interface {
@@ -152,7 +151,7 @@ type HATrackerConfig struct {
// more than this duration
FailoverTimeout time.Duration `yaml:"ha_tracker_failover_timeout" category:"advanced"`

-KVStore kv.Config `yaml:"kvstore" doc:"description=Backend storage to use for the ring. Please be aware that memberlist is not supported by the HA tracker since gossip propagation is too slow for HA purposes."`
+KVStore kv.Config `yaml:"kvstore" doc:"description=Backend storage to use for the ring. Note that memberlist support is experimental."`
}

// RegisterFlags adds the flags required to config this to the given FlagSet.
@@ -180,10 +179,6 @@ func (cfg *HATrackerConfig) Validate() error {
return fmt.Errorf(errInvalidFailoverTimeout, cfg.FailoverTimeout, minFailureTimeout)
}

-if cfg.KVStore.Store == "memberlist" {
-return errMemberlistUnsupported
-}
-
return nil
}

4 changes: 2 additions & 2 deletions pkg/distributor/ha_tracker_test.go
@@ -430,15 +430,15 @@ func TestHATrackerConfig_Validate(t *testing.T) {
}(),
expectedErr: nil,
},
-"should fail if KV backend is set to memberlist": {
+"should pass if KV backend is set to memberlist": {
cfg: func() HATrackerConfig {
cfg := HATrackerConfig{}
flagext.DefaultValues(&cfg)
cfg.KVStore.Store = "memberlist"

return cfg
}(),
-expectedErr: errMemberlistUnsupported,
+expectedErr: nil,
},
}

2 changes: 2 additions & 0 deletions pkg/mimir/modules.go
@@ -1023,6 +1023,7 @@ func (t *Mimir) initMemberlistKV() (services.Service, error) {
// Append to the list of codecs instead of overwriting the value to allow third parties to inject their own codecs.
t.Cfg.MemberlistKV.Codecs = append(t.Cfg.MemberlistKV.Codecs, ring.GetCodec())
t.Cfg.MemberlistKV.Codecs = append(t.Cfg.MemberlistKV.Codecs, ring.GetPartitionRingCodec())
+t.Cfg.MemberlistKV.Codecs = append(t.Cfg.MemberlistKV.Codecs, distributor.GetReplicaDescCodec())
Review comment from the PR author (Contributor):

FYI: Without adding the ReplicaDescCodec and initializing the singleton for HATrackerConfig.KVStore.MemberlistKV, the process panics when memberlist is used as the KV store.

Validated locally
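
To illustrate why the codec has to be registered before the store is used, here is a toy sketch, not the dskit or Mimir implementation. The `codec` interface, `namedCodec`, `kvStore`, and the IDs `"ringDesc"` and `"replicaDesc"` are hypothetical names chosen for this example, and the toy returns an error where the comment above reports a panic.

```go
package main

import "fmt"

// codec is a hypothetical, minimal stand-in for a KV codec; it only exposes an ID.
type codec interface {
	CodecID() string
}

// namedCodec is a placeholder codec identified purely by its name.
type namedCodec string

func (n namedCodec) CodecID() string { return string(n) }

// kvStore is a toy store that can only serve values whose codec was
// registered when the store was created.
type kvStore struct {
	codecs map[string]codec
}

func newKVStore(codecs ...codec) *kvStore {
	s := &kvStore{codecs: make(map[string]codec, len(codecs))}
	for _, c := range codecs {
		s.codecs[c.CodecID()] = c
	}
	return s
}

// client looks up the codec for a value type and fails if it is unknown.
func (s *kvStore) client(codecID string) (codec, error) {
	c, ok := s.codecs[codecID]
	if !ok {
		return nil, fmt.Errorf("codec %q is not registered", codecID)
	}
	return c, nil
}

func main() {
	// Append rather than overwrite, so previously registered codecs survive.
	var codecs []codec
	codecs = append(codecs, namedCodec("ringDesc"))

	_, err := newKVStore(codecs...).client("replicaDesc")
	fmt.Println("before registering:", err)

	codecs = append(codecs, namedCodec("replicaDesc"))
	_, err = newKVStore(codecs...).client("replicaDesc")
	fmt.Println("after registering:", err)
}
```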


dnsProviderReg := prometheus.WrapRegistererWithPrefix(
"cortex_",
@@ -1037,6 +1038,7 @@ func (t *Mimir) initMemberlistKV() (services.Service, error) {

// Update the config.
t.Cfg.Distributor.DistributorRing.Common.KVStore.MemberlistKV = t.MemberlistKV.GetMemberlistKV
+t.Cfg.Distributor.HATrackerConfig.KVStore.MemberlistKV = t.MemberlistKV.GetMemberlistKV
t.Cfg.Ingester.IngesterRing.KVStore.MemberlistKV = t.MemberlistKV.GetMemberlistKV
t.Cfg.Ingester.IngesterPartitionRing.KVStore.MemberlistKV = t.MemberlistKV.GetMemberlistKV
t.Cfg.StoreGateway.ShardingRing.KVStore.MemberlistKV = t.MemberlistKV.GetMemberlistKV
1 change: 1 addition & 0 deletions pkg/mimir/modules_test.go
@@ -193,6 +193,7 @@ func TestMultiKVSetup(t *testing.T) {

Distributor: func(t *testing.T, c Config) {
require.NotNil(t, c.Distributor.DistributorRing.Common.KVStore.Multi.ConfigProvider)
+require.NotNil(t, c.Distributor.HATrackerConfig.KVStore.MemberlistKV)
require.NotNil(t, c.Ingester.IngesterRing.KVStore.Multi.ConfigProvider)
require.NotNil(t, c.Ingester.IngesterPartitionRing.KVStore.Multi.ConfigProvider)
},