Commit b6406f3

Merge branch 'main' into feat/rel-notes-secret-lifetime
2 parents ef0ff4b + 9fda7aa commit b6406f3

13 files changed

+187
-39
lines changed

Lines changed: 42 additions & 0 deletions
Original file line number | Diff line number | Diff line change
@@ -0,0 +1,42 @@
1+
<!--
2+
DO NOT REMOVE THIS COMMENT. It is intended for people who might copy/paste from the previous release issue.
3+
This was created by a PR template: https://github.com/stackabletech/documentation/tree/main/.github/PULL_REQUEST_TEMPLATE/release-notes.md.
4+
-->
5+
6+
<!-- Release placeholders YY.M.X should be replaced. -->
7+
# Release Notes for SDP YY.M.X
8+
9+
> [!CAUTION]
10+
> Please assign the applicable `scheduled-for/YY.M.X` label.
11+
12+
> [!TIP]
13+
> - Use the commented out template headings in [release-notes][template].
14+
> - Begin each sentence on a new line. This helps with review suggestions and diffing.
15+
> - Use xrefs for links to other parts of the documentation so that they remain valid across versions.
16+
17+
[template]: https://github.com/stackabletech/documentation/blob/8dc93f28ac6d20a587f54d0a697c71fe47e8643a/modules/ROOT/pages/release-notes.adoc?plain=1#L11-L56
18+
19+
```[tasklist]
20+
#### Release note compilation tasks
21+
- [ ] Check [Issues](https://github.com/search?q=org%3Astackabletech+label%3Arelease-note%2Crelease-note%2Faction-required+label%3Arelease%YY.M.X%2Cscheduled-for%YY.M.X&type=issues) for Product and Platform release notes
22+
- [ ] Check [PRs](https://github.com/search?q=org%3Astackabletech+label%3Arelease-note%2Crelease-note%2Faction-required+label%3Arelease%YY.M.X%2Cscheduled-for%YY.M.X&type=pullrequests) for Product and Platform release notes
23+
- [ ] Optionally check the [Changelogs](https://github.com/search?q=org%3Astackabletech+path%3A*CHANGELOG.md+%22YY.M.X%22&type=code) in case release notes were missed
24+
- [ ] Compile list of new product versions that are supported and compile a list of new product features to include in the Release Highlights
25+
- [ ] Upgrade guide: Document how to use stackablectl to uninstall all and install new release
26+
- [ ] Upgrade guide: Document how to use helm to uninstall all and install new release
27+
- [ ] Upgrade guide: Every breaking change of all our operators
28+
- [ ] Upgrade guide: List removed product versions (if there are any)
29+
- [ ] Upgrade guide: List removed operators (if there are any)
30+
- [ ] Upgrade guide: List supported Kubernetes versions
31+
```
32+
33+
Each of the following tasks focuses on a specific goal and should be done once the items above have been completed.
34+
35+
```[tasklist]
36+
#### Release note review tasks
37+
- [ ] Check overall document structure
38+
- [ ] Check spelling, grammar, and correct wording
39+
- [ ] Check that internal links are xrefs
40+
- [ ] Check that rendered links are valid
41+
- [ ] Check that each sentence begins on a new line
42+
```

.github/workflows/build.yml

Lines changed: 4 additions & 0 deletions
@@ -2,6 +2,10 @@ name: Build site
22

33
on:
44
  pull_request:
5+
    paths-ignore:
6+
      - .github/PULL_REQUEST_TEMPLATES/**
7+
      - .github/ISSUE_TEMPLATES/**
8+
      - scripts/**
59

610
jobs:
711

.markdownlint.yaml

Lines changed: 6 additions & 0 deletions
@@ -18,3 +18,9 @@ MD013:
1818
MD024:
1919
  # Only check sibling headings
2020
  siblings_only: true
21+
22+
# MD032/blanks-around-lists
23+
MD032: false
24+
25+
# MD028/no-blanks-blockquote
26+
MD028: false

modules/ROOT/pages/quickstart.adoc

Lines changed: 3 additions & 3 deletions
@@ -1,5 +1,5 @@
11
= Quickstart
2-
:latest-release: https://github.com/stackabletech/stackable-cockpit/releases/tag/stackablectl-1.0.0-rc2
2+
:latest-release: https://github.com/stackabletech/stackable-cockpit/releases/tag/stackablectl-24.11.1
33
:cockpit-releases: https://github.com/stackabletech/stackable-cockpit/releases
44
:description: Quickstart guide for Stackable: Install stackablectl, set up a demo, and connect to services like Superset and Trino with easy commands and links.
55

@@ -17,9 +17,9 @@ rename the file to `stackablectl`. You can also use the following command:
1717

1818
[source,console]
1919
----
20-
wget -O stackablectl https://github.com/stackabletech/stackable-cockpit/releases/download/stackablectl-1.0.0-rc2/stackablectl-x86_64-unknown-linux-gnu
20+
wget -O stackablectl https://github.com/stackabletech/stackable-cockpit/releases/download/stackablectl-24.11.1/stackablectl-x86_64-unknown-linux-gnu
2121
# or
22-
curl -L -o stackablectl https://github.com/stackabletech/stackable-cockpit/releases/download/stackablectl-1.0.0-rc2/stackablectl-x86_64-unknown-linux-gnu
22+
curl -L -o stackablectl https://github.com/stackabletech/stackable-cockpit/releases/download/stackablectl-24.11.1/stackablectl-x86_64-unknown-linux-gnu
2323
----
2424

2525
Mark the binary as executable:

modules/concepts/nav.adoc

Lines changed: 1 addition & 0 deletions
@@ -20,6 +20,7 @@
2020
*** xref:operations/pod_disruptions.adoc[]
2121
*** xref:operations/pod_placement.adoc[]
2222
*** xref:operations/graceful_shutdown.adoc[]
23+
*** xref:operations/temporary_credentials_lifetime.adoc[]
2324
** Observability
2425
*** xref:labels.adoc[]
2526
*** xref:logging.adoc[]

modules/concepts/pages/index.adoc

Lines changed: 1 addition & 1 deletion
@@ -30,7 +30,7 @@ It also includes xref:tls-server-verification.adoc[].
3030
== Operations
3131

3232
The xref:operations/index.adoc[operations] section is directed at platform maintainers.
33-
It covers xref:operations/cluster_operations.adoc[starting, stopping and restarts] of products, xref:operations/graceful_shutdown.adoc[] and other topics related to maintenance and ensuring stability of the platform operation.
33+
It covers xref:operations/cluster_operations.adoc[starting, stopping and restarts] of products, xref:operations/graceful_shutdown.adoc[] and other topics related to maintenance and ensuring stability and availability of the platform operation.
3434

3535
== Observability
3636

modules/concepts/pages/operations/cluster_operations.adoc

Lines changed: 1 addition & 1 deletion
@@ -123,4 +123,4 @@ You can add more labels to make finer grained restarts.
123123
== Automatic restarts
124124

125125
The Commons Operator of the Stackable Platform may restart Pods automatically, for purposes such as ensuring that TLS certificates are up-to-date.
126-
For details, see the xref:commons-operator:index.adoc[Commons Operator documentation].
126+
For details, see xref:operations/temporary_credentials_lifetime.adoc[] as well as the xref:commons-operator:index.adoc[Commons Operator documentation].

modules/concepts/pages/operations/index.adoc

Lines changed: 6 additions & 4 deletions
@@ -11,21 +11,23 @@ Make sure to go through the following checklist to achieve the maximum level of
1111
1. Make setup highly available (HA): In case the product supports running in an HA fashion, our operators will automatically configure it for you.
1212
You only need to make sure that you deploy a sufficient number of replicas.
1313
Please note that some products don't support HA.
14-
2. Reduce the number of simultaneous pod disruptions (unavailable replicas).
14+
2. Reduce the number of simultaneous pod disruptions (unavailable replicas):
1515
The Stackable operators write defaults based upon knowledge about the fault tolerance of the product, which should cover most of the use-cases.
1616
For details have a look at xref:operations/pod_disruptions.adoc[].
1717
3. Reduce impact of pod disruptions:
1818
Many HA capable products offer a way to gracefully shut down the service running within the Pod.
1919
The flow is as follows: Kubernetes wants to shut down the Pod and calls a hook into the Pod, which in turn interacts with the product, signaling it to gracefully shut down.
2020
The final deletion of the Pod is then blocked until the product has successfully migrated running workloads away from the Pod that is to be shut down.
2121
Details covering the graceful shutdown mechanism are described in xref:operations/graceful_shutdown.adoc[] as well as the actual operator documentation.
22-
+
23-
WARNING: Graceful shutdown is not implemented for all products yet. Please check the documentation specific to the product operator to see if it is supported (such as e.g. xref:trino:usage-guide/operations/graceful-shutdown.adoc[the documentation for Trino].
24-
2522
4. Spread workload across multiple Kubernetes nodes, racks, datacenter rooms or datacenters to guarantee availability
2623
in the case of e.g. power outages or fire in parts of the datacenter. All of this is supported by
2724
configuring an https://kubernetes.io/docs/concepts/scheduling-eviction/assign-pod-node/[antiAffinity] as documented in
2825
xref:operations/pod_placement.adoc[]
26+
5. Reduce the frequency of disruptions:
27+
Although we try our best to reduce the impact of disruptions, some tools simply don't support HA setups.
28+
One example is the Trino coordinator - if you restart it, all running queries will fail.
29+
Many products use temporary credentials (such as TLS certificates), which have a short lifetime by default for maximum security.
30+
The xref:operations/temporary_credentials_lifetime.adoc[] page describes how you can increase the lifetime of these temporary credentials to avoid frequent restarts.
2931

3032
== Maintenance actions
3133

Lines changed: 92 additions & 0 deletions
@@ -0,0 +1,92 @@
1+
= Temporary credentials lifetime
2+
:description: Customize the lifetime of temporary credentials.
3+
4+
== Usages
5+
6+
=== TLS certificates
7+
8+
Currently the only temporary credentials are TLS certificates.
9+
10+
Many products use TLS to secure communication, and customers often use the xref:secret-operator:secretclass.adoc#backend-autotls[secret-operator autoTls] backend to create TLS certificates for the Pods on the fly.
11+
To increase security, these temporary credentials have a short lifetime by default, which will result in e.g. Trino coordinator Pods restarting every ~24 hours (minus some safety buffer) to avoid using expired certificates.
12+
13+
== Configure the lifetime
14+
15+
In high load production environments, restarting Pods can be a costly operation, as it can disrupt services and in some cases even lead to data loss.
16+
To avoid frequent restarts, the lifetime of all temporary credentials (such as the TLS certificates) can be increased as needed.
17+
18+
Here is an example for configuring the temporary credentials lifetime to 7 days in an HDFS stacklet.
19+
It should result in the HDFS Pods restarting weekly instead of daily:
20+
21+
[source,yaml]
22+
----
23+
---
24+
apiVersion: hdfs.stackable.tech/v1alpha1
25+
kind: HdfsCluster
26+
metadata:
27+
  name: hdfs
28+
spec:
29+
  nameNodes:
30+
    config:
31+
      requestedSecretLifetime: 7d # <1>
32+
    roleGroups:
33+
      default:
34+
        replicas: 2
35+
  dataNodes:
36+
    config:
37+
      requestedSecretLifetime: 7d # <2>
38+
    roleGroups:
39+
      default:
40+
        replicas: 2
41+
  journalNodes:
42+
    roleGroups:
43+
      default:
44+
        replicas: 3
45+
        config:
46+
          requestedSecretLifetime: 7d # <3>
47+
----
48+
<1> The lifetime of the TLS certificates for *all* NameNode roleGroups is set to 7 days.
49+
<2> The lifetime of the TLS certificates for *all* DataNode roleGroups is set to 7 days.
50+
<3> The lifetime of the TLS certificates for the `default` JournalNode group is set to 7 days.
51+
52+
NOTE: The configuration for the JournalNodes is done at roleGroup level for demonstration purposes.
53+
54+
=== TLS certificate lifetimes
55+
56+
Even though operators allow setting this property to a value of your choice, the xref:secret-operator:index.adoc[secret-operator] will not exceed the `maxCertificateLifetime` value specified in the SecretClass that creates the TLS certificates (see xref:secret-operator:secretclass.adoc#certificate_lifetime[]).
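The capping behaviour described above can be sketched as taking the smaller of the two durations.
This is only an illustration, not the secret-operator's implementation: `parse_duration` and `effective_lifetime` are hypothetical helpers, the duration syntax is a simplified assumption (`d`, `h`, `m`, `s` units only), and the `15d` maximum in the usage lines is an arbitrary example value.

```python
import re
from datetime import timedelta

# Assumed, simplified unit set; the real duration syntax may accept more.
_UNITS = {"d": "days", "h": "hours", "m": "minutes", "s": "seconds"}


def parse_duration(text: str) -> timedelta:
    """Parse a simplified duration such as '7d' or '1d12h' (hypothetical helper)."""
    parts = re.findall(r"(\d+)([dhms])", text)
    # Reject input that contains anything other than number/unit pairs.
    if not parts or "".join(n + u for n, u in parts) != text:
        raise ValueError(f"unsupported duration: {text!r}")
    return sum((timedelta(**{_UNITS[u]: int(n)}) for n, u in parts), timedelta())


def effective_lifetime(requested: str, max_lifetime: str) -> timedelta:
    """Return the requested lifetime, capped at the maximum (hypothetical helper)."""
    return min(parse_duration(requested), parse_duration(max_lifetime))


# A request below the cap is honoured, a request above it is reduced to the cap.
print(effective_lifetime("7d", "15d"))
print(effective_lifetime("30d", "15d"))
```

In this sketch, requesting `30d` against a `15d` maximum yields 15 days, so a requested value larger than `maxCertificateLifetime` would not reduce the restart frequency any further.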
57+
58+
=== Operator support
59+
60+
Similar to the example above, users can configure the lifetime of temporary credentials for the following operators:
61+
62+
* Apache Druid
63+
* Apache Hadoop
64+
* Apache HBase
65+
* Apache NiFi
66+
* Apache Spark
67+
* Apache ZooKeeper
68+
* Trino
69+
70+
== Pod lifetime annotations
71+
72+
After configuring the lifetime as described above you could simply observe your stacklet for a week/month (or whatever your new lifetime is), to see if your changes take effect.
73+
However, it's much quicker to check at what point in time your Pods will be restarted next.
74+
75+
Pods are not restarted "randomly" by Stackable operators, but in a predictable manner.
76+
When a temporary credential is added to a Pod, an annotation is added as well.
77+
It starts with `restarter.stackable.tech/expires-at.` and instructs the xref:commons-operator:restarter.adoc[restart-controller] to restart the Pod once the specified point in time is reached.
78+
79+
Given the following Pod:
80+
81+
[source,yaml]
82+
----
83+
kind: Pod
84+
metadata:
85+
  annotations:
86+
    restarter.stackable.tech/expires-at.b887492af14bfe84f10cb2ff1b60acb0: "2024-12-05T14:03:47.131570189+00:00"
87+
    restarter.stackable.tech/expires-at.ea77192c1184326d33e8ee32cfe921ea: "2024-12-05T15:49:10.043722965+00:00"
88+
----
89+
90+
You can always determine the instant the Pod will be restarted by the xref:commons-operator:restarter.adoc[restart-controller] by taking the earliest timestamp, `2024-12-05T14:03:47.131570189+00:00` in this case.
91+
92+
You can use this timestamp to check if your changes have been applied as intended.
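As a minimal sketch of that check, the following Python snippet picks the earliest `restarter.stackable.tech/expires-at.` annotation from a Pod's annotations.
The function names `parse_timestamp` and `next_restart` are hypothetical, and the annotations are assumed to be already available as a plain dictionary (for example, read from the Pod's metadata); the timestamps are the ones from the example Pod above.

```python
import re
from datetime import datetime
from typing import Mapping, Optional

PREFIX = "restarter.stackable.tech/expires-at."


def parse_timestamp(value: str) -> datetime:
    """Parse an RFC 3339 timestamp, trimming nanoseconds to microseconds."""
    # datetime stores at most 6 fractional digits, so drop any extra ones.
    trimmed = re.sub(r"(\.\d{6})\d+", r"\1", value)
    return datetime.fromisoformat(trimmed)


def next_restart(annotations: Mapping[str, str]) -> Optional[datetime]:
    """Return the earliest expiry among the restarter annotations, if any."""
    expiries = [
        parse_timestamp(value)
        for key, value in annotations.items()
        if key.startswith(PREFIX)
    ]
    return min(expiries, default=None)


# Annotations copied from the example Pod.
annotations = {
    PREFIX + "b887492af14bfe84f10cb2ff1b60acb0": "2024-12-05T14:03:47.131570189+00:00",
    PREFIX + "ea77192c1184326d33e8ee32cfe921ea": "2024-12-05T15:49:10.043722965+00:00",
}

print(next_restart(annotations))
```

Trimming to microseconds loses sub-microsecond precision, which should not matter when the goal is only to confirm that the new lifetime has been applied.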

modules/contributor/pages/adr/ADR029-database-connection.adoc

Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -15,6 +15,8 @@ v0.1, 2022-12-08
1515
1616
Technical Story: https://github.com/stackabletech/issues/issues/238
1717

18+
NOTE: We might want to incorporate changes to address https://github.com/stackabletech/issues/issues/681, maybe as V2?
19+
1820
== Context and Problem Statement
1921

2022
Many products supported by the Stackable Data Platform require databases to store metadata. Currently there is no uniform, consistent way to define database connections. In addition, some Stackable operators define database credentials to be provided inline and in plain text in the cluster definitions.
