cache: add timeout for groupcache's fetch operation (thanos-io#5206)
* cache: add timeout for groupcache's fetch operation

Add a timeout for groupcache's fetch operation. It is useful when there
are network errors - if loading from a peer fails then we still might
have a chance to be able to load data from remote object storage
ourselves.

Signed-off-by: Giedrius Statkevičius <[email protected]>

* CHANGELOG: add entry

Signed-off-by: Giedrius Statkevičius <[email protected]>

* cache: add yaml tag for new field

Signed-off-by: Giedrius Statkevičius <[email protected]>

* cache: bump default timeout, improve docs

Signed-off-by: Giedrius Statkevičius <[email protected]>

* docs: make changes according to Matej's suggestions

Signed-off-by: Giedrius Statkevičius <[email protected]>
GiedriusS committed May 11, 2022
1 parent 7684ae9 commit 4b292cd
Showing 3 changed files with 19 additions and 0 deletions.
4 changes: 4 additions & 0 deletions CHANGELOG.md
@@ -11,6 +11,10 @@ We use *breaking :warning:* to mark changes that are not backward compatible (re
## Performance

### Added
=======

- [#5205](https://github.com/thanos-io/thanos/pull/5205) Rule: Add ruler labels as external labels in stateless ruler mode.
- [#5206](https://github.com/thanos-io/thanos/pull/5206) Cache: add timeout for groupcache's fetch operation

- [#4290](https://github.com/thanos-io/thanos/pull/4290) proxy: coalesce multiple requests for the same data; greatly improves performance when opening a dashboard without query-frontend where there are a lot of different panels (queries) asking for the same data

3 changes: 3 additions & 0 deletions docs/components/store.md
@@ -429,6 +429,7 @@ config:
- http://10.123.22.100:8080
groupcache_group: test_group
dns_interval: 1s
timeout: 2s
```

In this case, three Thanos Store nodes are running in the same group meaning that they all point to the same remote object storage.
@@ -441,6 +442,8 @@ In the `peers` section it is possible to use the prefix form to automatically lo

Note that there must be no trailing slash in the `peers` configuration i.e. one of the strings must be identical to `self_url` and others should have the same form. Without this, loading data from peers may fail.

If `timeout` is set to zero, fetching has no timeout of its own and its lifetime equals the lifetime of the original request. It is recommended to set it higher than zero. A more generous value is generally preferable because the fetch operation may include loading data from remote object storage.

## Index Header

In order to query series inside blocks from object storage, Store Gateway has to know certain initial info from each block index. To achieve this, on startup the Gateway builds an `index-header` for each block and stores it on local disk; such an `index-header` is built by downloading specific pieces of the original block's index, stored on local disk and then mmaped and used by Store Gateway.
12 changes: 12 additions & 0 deletions pkg/cache/groupcache.go
@@ -36,6 +36,7 @@ type Groupcache struct {
	galaxy   *galaxycache.Galaxy
	universe *galaxycache.Universe
	logger   log.Logger
	timeout  time.Duration
}

// GroupcacheConfig holds the in-memory cache config.
@@ -59,13 +60,17 @@ type GroupcacheConfig struct {

	// How often we should resolve the addresses.
	DNSInterval time.Duration `yaml:"dns_interval"`

	// Timeout specifies the read/write timeout.
	Timeout time.Duration `yaml:"timeout"`
}

var (
	DefaultGroupcacheConfig = GroupcacheConfig{
		MaxSize:       250 * 1024 * 1024,
		DNSSDResolver: dns.GolangResolverType,
		DNSInterval:   1 * time.Minute,
		Timeout:       2 * time.Second,
	}
)

@@ -255,6 +260,7 @@ func NewGroupcacheWithConfig(logger log.Logger, reg prometheus.Registerer, conf
		logger:   logger,
		galaxy:   galaxy,
		universe: universe,
		timeout:  conf.Timeout,
	}, nil
}

@@ -265,6 +271,12 @@ func (c *Groupcache) Store(ctx context.Context, data map[string][]byte, ttl time
func (c *Groupcache) Fetch(ctx context.Context, keys []string) map[string][]byte {
	data := map[string][]byte{}

	if c.timeout != 0 {
		timeoutCtx, cancel := context.WithTimeout(ctx, c.timeout)
		ctx = timeoutCtx
		defer cancel()
	}

	for _, k := range keys {
		codec := galaxycache.ByteCodec{}

