Check demos and upgrade from 24.3 to dev release #59

Closed
13 tasks done
Tracked by #557
xeniape opened this issue Jul 15, 2024 · 14 comments

@xeniape
Member

xeniape commented Jul 15, 2024

Description

For each demo: test the upgrade process from 24.3 to the dev release and document any changes required for upgrading. Also check the steps in the demo documentation to ensure the demo works properly with the dev release.

Tasks

  1. release/2024-07
    xeniape
@xeniape xeniape self-assigned this Jul 15, 2024
@xeniape
Member Author

xeniape commented Jul 16, 2024

airflow-scheduled-job

  • Found nothing breaking.
  • airflow-webserver restarted twice due to OOM, but it is not clear exactly when this happened; possibly during upgrade phases where pods weren't available. It mostly ran under the requested memory limit. No jobs failed, and there was no effect on the demo other than probably a short unavailability during the restarts.

Ran following steps for testing upgrade:

# install demo
stackablectl demo install airflow-scheduled-job

# add helm repos for upgrades
helm repo add bitnami https://charts.bitnami.com/bitnami

# upgrade postgresql and redis versions
helm upgrade postgresql-airflow bitnami/postgresql --version 15.5.16
helm upgrade redis-airflow bitnami/redis --version 19.6.1

# uninstall operators
stackablectl release uninstall 24.3

# update crds
kubectl replace -f https://raw.githubusercontent.com/stackabletech/airflow-operator/main/deploy/helm/airflow-operator/crds/crds.yaml
kubectl replace -f https://raw.githubusercontent.com/stackabletech/commons-operator/main/deploy/helm/commons-operator/crds/crds.yaml
kubectl replace -f https://raw.githubusercontent.com/stackabletech/listener-operator/main/deploy/helm/listener-operator/crds/crds.yaml
kubectl replace -f https://raw.githubusercontent.com/stackabletech/secret-operator/main/deploy/helm/secret-operator/crds/crds.yaml
kubectl replace -f https://raw.githubusercontent.com/stackabletech/spark-k8s-operator/main/deploy/helm/spark-k8s-operator/crds/crds.yaml

# install dev version of operators
stackablectl operator install commons listener secret airflow spark-k8s

# edit 'productVersion' for AirflowCluster and SparkApplication
kubectl edit airflowclusters/airflow # change version to 2.9.2
kubectl edit sparkapplications/pyspark-pi-20240716092248 # change version to 3.5.1

And then went through the demo steps in https://docs.stackable.tech/home/nightly/demos/airflow-scheduled-job
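The `kubectl replace` lines above differ only in the operator name, so they can be generated in a loop. A minimal sketch, assuming the URL pattern shown above holds for every operator; `crd_url`, `replace_crds`, and the `DRY_RUN` switch are made-up helper names, not part of kubectl or stackablectl. By default it only prints the commands.

```shell
# Build the raw CRD URL for one operator, following the URL pattern used above.
crd_url() {
  printf 'https://raw.githubusercontent.com/stackabletech/%s-operator/main/deploy/helm/%s-operator/crds/crds.yaml\n' "$1" "$1"
}

# Print (DRY_RUN=1, the default) or run `kubectl replace` for each operator given.
# DRY_RUN is a hypothetical switch introduced for this sketch.
replace_crds() {
  for operator in "$@"; do
    if [ "${DRY_RUN:-1}" = "1" ]; then
      echo "kubectl replace -f $(crd_url "$operator")"
    else
      kubectl replace -f "$(crd_url "$operator")"
    fi
  done
}

# Operators used by the airflow-scheduled-job demo.
replace_crds commons listener secret airflow spark-k8s
```

Setting `DRY_RUN=0` runs the commands against the current cluster instead of printing them.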

@xeniape xeniape transferred this issue from stackabletech/issues Jul 16, 2024
@xeniape
Member Author

xeniape commented Jul 16, 2024

hbase-hdfs-load-cycling-data

  • Found nothing breaking.

Ran following steps for testing upgrade:

# install demo
stackablectl demo install hbase-hdfs-load-cycling-data

# uninstall operators
stackablectl release uninstall 24.3

# update crds
kubectl replace -f https://raw.githubusercontent.com/stackabletech/commons-operator/main/deploy/helm/commons-operator/crds/crds.yaml
kubectl replace -f https://raw.githubusercontent.com/stackabletech/listener-operator/main/deploy/helm/listener-operator/crds/crds.yaml
kubectl replace -f https://raw.githubusercontent.com/stackabletech/secret-operator/main/deploy/helm/secret-operator/crds/crds.yaml
kubectl replace -f https://raw.githubusercontent.com/stackabletech/hbase-operator/main/deploy/helm/hbase-operator/crds/crds.yaml
kubectl replace -f https://raw.githubusercontent.com/stackabletech/hdfs-operator/main/deploy/helm/hdfs-operator/crds/crds.yaml
kubectl replace -f https://raw.githubusercontent.com/stackabletech/zookeeper-operator/main/deploy/helm/zookeeper-operator/crds/crds.yaml

# install dev version of operators
stackablectl operator install commons listener secret hbase hdfs zookeeper

# edit 'productVersion' for HBaseCluster, HdfsCluster, and ZookeeperCluster
kubectl edit hbaseclusters/hbase # change version to 2.4.18
kubectl edit hdfsclusters/hdfs # change version to 3.4.0

And then went through the demo steps in https://docs.stackable.tech/home/nightly/demos/hbase-hdfs-load-cycling-data

@NickLarsenNZ NickLarsenNZ changed the title from "Check demos for dev release" to "Check demos and upgrade from 23.4 to dev release" Jul 16, 2024
@xeniape
Member Author

xeniape commented Jul 16, 2024

nifi-kafka-druid-earthquake-data

Ran following steps for testing upgrade:

# install demo
stackablectl demo install nifi-kafka-druid-earthquake-data

# add helm repos for upgrades
helm repo add minio https://charts.min.io/
helm repo add bitnami https://charts.bitnami.com/bitnami

# upgrade minio and postgresql versions
helm upgrade minio minio/minio --version 5.2.0
helm upgrade postgresql-druid bitnami/postgresql --version 15.5.16
helm upgrade postgresql-superset bitnami/postgresql --version 15.5.16

# uninstall operators
stackablectl release uninstall 24.3

# update crds
kubectl replace -f https://raw.githubusercontent.com/stackabletech/commons-operator/main/deploy/helm/commons-operator/crds/crds.yaml
kubectl replace -f https://raw.githubusercontent.com/stackabletech/listener-operator/main/deploy/helm/listener-operator/crds/crds.yaml
kubectl replace -f https://raw.githubusercontent.com/stackabletech/secret-operator/main/deploy/helm/secret-operator/crds/crds.yaml
kubectl replace -f https://raw.githubusercontent.com/stackabletech/druid-operator/main/deploy/helm/druid-operator/crds/crds.yaml
kubectl replace -f https://raw.githubusercontent.com/stackabletech/kafka-operator/main/deploy/helm/kafka-operator/crds/crds.yaml
kubectl replace -f https://raw.githubusercontent.com/stackabletech/nifi-operator/main/deploy/helm/nifi-operator/crds/crds.yaml
kubectl replace -f https://raw.githubusercontent.com/stackabletech/superset-operator/main/deploy/helm/superset-operator/crds/crds.yaml
kubectl replace -f https://raw.githubusercontent.com/stackabletech/zookeeper-operator/main/deploy/helm/zookeeper-operator/crds/crds.yaml

# install dev version of operators
stackablectl operator install commons listener secret druid kafka nifi superset zookeeper

# edit 'productVersion' for products
# kubectl patch druidclusters/druid --type='json' -p='[{"op": "replace", "path": "/spec/image/productVersion", "value":"30.0.0"}]' -> skipped due to errors mentioned above
kubectl patch kafkaclusters/kafka --type='json' -p='[{"op": "replace", "path": "/spec/image/productVersion", "value":"3.7.1"}]'
kubectl patch nificlusters/nifi --type='json' -p='[{"op": "replace", "path": "/spec/image/productVersion", "value":"1.27.0"}]'
kubectl patch supersetclusters/superset --type='json' -p='[{"op": "replace", "path": "/spec/image/productVersion", "value":"4.0.2"}]'

# breaking changes

cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Secret
metadata:
  name: druid-db-credentials
stringData:
  username: druid
  password: druid
EOF

kubectl patch druidclusters/druid --type='json' -p='[{"op": "replace", "path": "/spec/clusterConfig/metadataStorageDatabase/credentialsSecret", "value":"druid-db-credentials"}]'

And then went through the demo steps in https://docs.stackable.tech/home/stable/demos/nifi-kafka-druid-earthquake-data

nifi-kafka-druid-water-level-data demo

This demo has exactly the same steps as the nifi-kafka-druid-earthquake-data demo, except for the installation step

# install demo
stackablectl demo install nifi-kafka-druid-water-level-data

And then went through the demo steps in https://docs.stackable.tech/home/nightly/demos/nifi-kafka-druid-water-level-data
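The `kubectl patch` calls in these steps differ only in the resource and the target version. A small sketch of a wrapper, assuming every product keeps its version under `/spec/image/productVersion` as in the commands above; `product_version_patch`, `patch_product_version`, and `DRY_RUN` are hypothetical names used only for illustration. By default it only prints the commands.

```shell
# Build the JSON patch that replaces /spec/image/productVersion with the given version.
product_version_patch() {
  printf '[{"op": "replace", "path": "/spec/image/productVersion", "value":"%s"}]' "$1"
}

# Print (DRY_RUN=1, the default) or apply the patch to a resource such as kafkaclusters/kafka.
# DRY_RUN is a hypothetical switch introduced for this sketch.
patch_product_version() {
  resource="$1"
  version="$2"
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "kubectl patch $resource --type='json' -p='$(product_version_patch "$version")'"
  else
    kubectl patch "$resource" --type='json' -p="$(product_version_patch "$version")"
  fi
}

# Resources and versions taken from the steps above.
patch_product_version kafkaclusters/kafka 3.7.1
patch_product_version nificlusters/nifi 1.27.0
patch_product_version supersetclusters/superset 4.0.2
```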

@soenkeliebau soenkeliebau changed the title from "Check demos and upgrade from 23.4 to dev release" to "Check demos and upgrade from 24.3 to dev release" Jul 17, 2024
@xeniape
Member Author

xeniape commented Jul 17, 2024

I created this overview to visualize which operators, products, and additional tools are used where, and which version to upgrade to. Maybe it helps someone else as well. It is not complete, just a work in progress to give me an overview. It looks better in Obsidian (where I created it), but I added it here to share.

Maybe it's a little bit better here: https://app.nuclino.com/Stackable/Engineering/Release-247-61295855-9e7a-49aa-bd90-3453c7a7a6e8

Operators

Demo commons secret listener airflow druid hbase hdfs hive kafka nifi opa spark-k8s superset trino zookeeper
airflow-scheduled-job x x x x x
hbase-hdfs-load-cycling-data x x x x x x
end-to-end-security x x x x x x x x x x
nifi-kafka-druid-earthquake-data x x x x x x x x
nifi-kafka-druid-water-level-data x x x x x x x x
spark-k8s-anomaly-detection-taxi-data x x x x x x x x
trino-iceberg x x x x x x
trino-taxi-data x x x x x x x
data-lakehouse-iceberg-trino-spark x x x x x x x x x x x
jupyterhub-pyspark-hdfs-anomaly-detection-taxi-data x x x x x x
logging x x x x
signal-processing x x x x x

Products

Demo airflow druid hbase phoenix omid hdfs hive kafka nifi spark superset trino zookeeper opa vector
versions 2.9.2 30.0.0 2.4.18 5.2.0 1.1.2 3.4.0 3.1.3 3.7.1 1.27.0 3.5.1 4.0.2 451 3.9.2 0.66.0 0.39.0
airflow-scheduled-job x x
hbase-hdfs-load-cycling-data x x x x
end-to-end-security x x x x x x
nifi-kafka-druid-earthquake-data x x x x x
nifi-kafka-druid-water-level-data x x x x x
spark-k8s-anomaly-detection-taxi-data x x x x
trino-iceberg x x x
trino-taxi-data x x x x
data-lakehouse-iceberg-trino-spark x x x x x x x
jupyterhub-pyspark-hdfs-anomaly-detection-taxi-data x x
logging x x
signal-processing x x

Additional Tools

Demo postgres redis krb5 keycloak minio jupyterhub opensearch opensearch-dashboards grafana timescale
versions 15.5.16 19.6.1 5.2.0
airflow-scheduled-job x x
hbase-hdfs-load-cycling-data
end-to-end-security x x x
nifi-kafka-druid-earthquake-data x x
nifi-kafka-druid-water-level-data x x
spark-k8s-anomaly-detection-taxi-data x x
trino-iceberg x x
trino-taxi-data x x
data-lakehouse-iceberg-trino-spark x x
jupyterhub-pyspark-hdfs-anomaly-detection-taxi-data x
logging x x
signal-processing x x x

@xeniape
Member Author

xeniape commented Jul 17, 2024

jupyterhub-pyspark-hdfs-anomaly-detection-taxi-data

  • hdfs-operator throws an error while trying to upgrade the hdfs version in the statefulset https://stackable-workspace.slack.com/archives/C031A56R127/p1721208024421789 -> the solution for now was deleting the hdfs statefulsets
  • While the new hdfs namenode pod is starting, it throws an error because of an old layout version and goes into a CrashLoopBackOff state -> upgrades of hdfs are currently not supported, see "Support [rolling] upgrade of HDFS" (hdfs-operator#362) -> for now restarting the demo with hdfs already on version 3.4.0
  • The remaining parts of the demo worked fine. Just a deprecation warning I noticed in the jupyterhub playbook:
    Image

Ran following steps for testing upgrade:

# install demo
stackablectl demo install jupyterhub-pyspark-hdfs-anomaly-detection-taxi-data

# add helm repos for upgrades
helm repo add jupyterhub https://jupyterhub.github.io/helm-chart/

# upgrade jupyterhub version
helm upgrade jupyterhub jupyterhub/jupyterhub --version 3.3.7

# uninstall operators
stackablectl release uninstall 24.3

# update crds
kubectl replace -f https://raw.githubusercontent.com/stackabletech/commons-operator/main/deploy/helm/commons-operator/crds/crds.yaml
kubectl replace -f https://raw.githubusercontent.com/stackabletech/listener-operator/main/deploy/helm/listener-operator/crds/crds.yaml
kubectl replace -f https://raw.githubusercontent.com/stackabletech/secret-operator/main/deploy/helm/secret-operator/crds/crds.yaml
kubectl replace -f https://raw.githubusercontent.com/stackabletech/hdfs-operator/main/deploy/helm/hdfs-operator/crds/crds.yaml
kubectl replace -f https://raw.githubusercontent.com/stackabletech/spark-k8s-operator/main/deploy/helm/spark-k8s-operator/crds/crds.yaml
kubectl replace -f https://raw.githubusercontent.com/stackabletech/zookeeper-operator/main/deploy/helm/zookeeper-operator/crds/crds.yaml

# install dev version of operators
stackablectl operator install commons listener secret hdfs spark-k8s zookeeper

# edit 'productVersion' for products
# kubectl patch hdfsclusters/hdfs --type='json' -p='[{"op": "replace", "path": "/spec/image/productVersion", "value":"3.4.0"}]'
# skipped because of upgrading issues, see list above. Deployed version 3.4.0 directly by modifying the stacks-v2.yaml file and hdfs version locally in the repo

And then went through the demo steps in https://docs.stackable.tech/home/nightly/demos/jupyterhub-pyspark-hdfs-anomaly-detection-taxi-data

@Techassi
Member

Techassi commented Jul 18, 2024

🟢 signal-processing

  • The demo uses a custom Docker image which didn't include config-utils because the base image used (nifi-1.25.0) is too old and doesn't include the new tool.
  • Rebuilding and pushing the image with NiFi 1.27.0 fixed the issue.

I ran following steps for testing the upgrade:

# install demo
stackablectl demo install signal-processing

# uninstall operators
stackablectl release uninstall 24.3

# update crds
kubectl replace -f https://raw.githubusercontent.com/stackabletech/commons-operator/main/deploy/helm/commons-operator/crds/crds.yaml
kubectl replace -f https://raw.githubusercontent.com/stackabletech/listener-operator/main/deploy/helm/listener-operator/crds/crds.yaml
kubectl replace -f https://raw.githubusercontent.com/stackabletech/secret-operator/main/deploy/helm/secret-operator/crds/crds.yaml
kubectl replace -f https://raw.githubusercontent.com/stackabletech/zookeeper-operator/main/deploy/helm/zookeeper-operator/crds/crds.yaml
kubectl replace -f https://raw.githubusercontent.com/stackabletech/nifi-operator/main/deploy/helm/nifi-operator/crds/crds.yaml

# install dev version of operators
stackablectl operator install commons listener secret zookeeper nifi

# edit 'productVersion' for NifiCluster
kubectl edit nificlusters/nifi # change version to 1.27.0

I went through the demo steps in https://docs.stackable.tech/home/stable/demos/signal-processing before the upgrade, and wanted to validate it afterwards.

Required changes: f62f996 (#60)

@siegfriedweber
Member

siegfriedweber commented Jul 18, 2024

logging

Ran following steps for testing upgrade:

# install demo
stackablectl demo install logging

# add helm repos for upgrades
helm repo add vector https://helm.vector.dev

# upgrade Vector versions
helm upgrade vector-aggregator vector/vector --version 0.34.0

# uninstall operators
stackablectl release uninstall 24.3

# copy the secret `secret-provisioner-tls-ca` to the operator's namespace
kubectl get secrets secret-provisioner-tls-ca --output=yaml | \
    sed 's/namespace: .*/namespace: stackable-operators/' | \
    kubectl create --filename=-

# update crds
kubectl replace -f https://raw.githubusercontent.com/stackabletech/commons-operator/main/deploy/helm/commons-operator/crds/crds.yaml
kubectl replace -f https://raw.githubusercontent.com/stackabletech/listener-operator/main/deploy/helm/listener-operator/crds/crds.yaml
kubectl replace -f https://raw.githubusercontent.com/stackabletech/secret-operator/main/deploy/helm/secret-operator/crds/crds.yaml
kubectl replace -f https://raw.githubusercontent.com/stackabletech/zookeeper-operator/main/deploy/helm/zookeeper-operator/crds/crds.yaml

# install dev version of operators
stackablectl operator install commons listener secret zookeeper

And then went through the demo steps in https://docs.stackable.tech/home/nightly/demos/logging
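To make the secret-copy step easier to follow, here is what the `sed` expression does to a manifest. This is a sketch on a hand-written, trimmed-down sample; the real output of `kubectl get secrets ... --output=yaml` contains more fields, and `sample_secret` and `retarget_namespace` are hypothetical helper names.

```shell
# Hypothetical, trimmed-down stand-in for the YAML printed by `kubectl get secrets`.
sample_secret() {
  cat <<'EOF'
apiVersion: v1
kind: Secret
metadata:
  name: secret-provisioner-tls-ca
  namespace: default
EOF
}

# Same sed expression as in the steps above: rewrite every "namespace: ..." field.
retarget_namespace() {
  sed 's/namespace: .*/namespace: stackable-operators/'
}

# Print the manifest as it would be piped into `kubectl create --filename=-`.
sample_secret | retarget_namespace
```

Only the `namespace:` line changes; all other lines pass through untouched.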

@razvan
Member

razvan commented Jul 18, 2024

🟢 spark-k8s-anomaly-detection-taxi-data

  • after the operator upgrade I had to delete the spark driver pod manually for the application to be re-run
  • I had problems logging into the MinIO interface (connection timeouts and resets) but it eventually worked.
  • logging into Superset, I could view the dashboard

# install demo
stackablectl demo install spark-k8s-anomaly-detection-taxi-data

# uninstall operators
stackablectl release uninstall 24.3

# update crds
kubectl replace -f https://raw.githubusercontent.com/stackabletech/commons-operator/main/deploy/helm/commons-operator/crds/crds.yaml
kubectl replace -f https://raw.githubusercontent.com/stackabletech/listener-operator/main/deploy/helm/listener-operator/crds/crds.yaml
kubectl replace -f https://raw.githubusercontent.com/stackabletech/secret-operator/main/deploy/helm/secret-operator/crds/crds.yaml
kubectl replace -f https://raw.githubusercontent.com/stackabletech/spark-k8s-operator/main/deploy/helm/spark-k8s-operator/crds/crds.yaml
kubectl replace -f https://raw.githubusercontent.com/stackabletech/hive-operator/main/deploy/helm/hive-operator/crds/crds.yaml
kubectl replace -f https://raw.githubusercontent.com/stackabletech/opa-operator/main/deploy/helm/opa-operator/crds/crds.yaml
kubectl replace -f https://raw.githubusercontent.com/stackabletech/superset-operator/main/deploy/helm/superset-operator/crds/crds.yaml
kubectl replace -f https://raw.githubusercontent.com/stackabletech/trino-operator/main/deploy/helm/trino-operator/crds/crds.yaml

# install dev version of operators
stackablectl op in commons secret listener hive opa spark-k8s superset trino

@dervoeti
Member

🟢 trino-iceberg

  • Everything worked fine, found one bug in the 24.3 docs but that was already fixed in the nightly version 🙂

# install demo
stackablectl demo install trino-iceberg

# upgrade postgres chart
helm upgrade postgresql-hive-iceberg bitnami/postgresql --version 15.5.16

# uninstall operators
stackablectl release uninstall 24.3

# update crds
kubectl replace -f https://raw.githubusercontent.com/stackabletech/commons-operator/main/deploy/helm/commons-operator/crds/crds.yaml
kubectl replace -f https://raw.githubusercontent.com/stackabletech/listener-operator/main/deploy/helm/listener-operator/crds/crds.yaml
kubectl replace -f https://raw.githubusercontent.com/stackabletech/secret-operator/main/deploy/helm/secret-operator/crds/crds.yaml
kubectl replace -f https://raw.githubusercontent.com/stackabletech/hive-operator/main/deploy/helm/hive-operator/crds/crds.yaml
kubectl replace -f https://raw.githubusercontent.com/stackabletech/trino-operator/main/deploy/helm/trino-operator/crds/crds.yaml
kubectl replace -f https://raw.githubusercontent.com/stackabletech/opa-operator/main/deploy/helm/opa-operator/crds/crds.yaml

# install dev version of operators
stackablectl operator install commons listener secret hive trino opa

# edit 'productVersion' for products
kubectl patch opaclusters/opa --type='json' -p='[{"op": "replace", "path": "/spec/image/productVersion", "value":"0.66.0"}]'
kubectl patch trinoclusters/trino --type='json' -p='[{"op": "replace", "path": "/spec/image/productVersion", "value":"451"}]'

@NickLarsenNZ
Member

NickLarsenNZ commented Jul 19, 2024

🟢 data-lakehouse-iceberg-trino-spark

It basically worked, just some demo docs to update. See the outcome at the end.

# install demo
stackablectl demo install data-lakehouse-iceberg-trino-spark

# add helm repos for upgrades
helm repo add minio https://charts.min.io/
helm repo add bitnami https://charts.bitnami.com/bitnami

# upgrade minio and postgresql versions
helm upgrade minio minio/minio --version 5.2.0
helm upgrade postgresql-hive bitnami/postgresql --version 15.5.16
helm upgrade postgresql-hive-iceberg bitnami/postgresql --version 15.5.16
helm upgrade postgresql-superset bitnami/postgresql --version 15.5.16

# Postgres gave the following warnings:
# WARNING: There are "resources" sections in the chart not set. Using "resourcesPreset" is not recommended for production. For production installations, please set the following values according to your workload needs:
#     - primary.resources
#     - readReplicas.resources

# uninstall operators
stackablectl release uninstall 24.3

# copy the secret `secret-provisioner-tls-ca` to the operator's namespace
# Thanks @siegfriedweber. See: https://github.com/stackabletech/secret-operator/issues/453
kubectl -n default get secrets secret-provisioner-tls-ca --output=yaml | \
    sed 's/namespace: .*/namespace: stackable-operators/' | \
    kubectl create --filename=-

# update crds
kubectl replace -f https://raw.githubusercontent.com/stackabletech/commons-operator/main/deploy/helm/commons-operator/crds/crds.yaml
kubectl replace -f https://raw.githubusercontent.com/stackabletech/listener-operator/main/deploy/helm/listener-operator/crds/crds.yaml
kubectl replace -f https://raw.githubusercontent.com/stackabletech/secret-operator/main/deploy/helm/secret-operator/crds/crds.yaml
kubectl replace -f https://raw.githubusercontent.com/stackabletech/hive-operator/main/deploy/helm/hive-operator/crds/crds.yaml
kubectl replace -f https://raw.githubusercontent.com/stackabletech/kafka-operator/main/deploy/helm/kafka-operator/crds/crds.yaml
kubectl replace -f https://raw.githubusercontent.com/stackabletech/nifi-operator/main/deploy/helm/nifi-operator/crds/crds.yaml
kubectl replace -f https://raw.githubusercontent.com/stackabletech/opa-operator/main/deploy/helm/opa-operator/crds/crds.yaml
kubectl replace -f https://raw.githubusercontent.com/stackabletech/spark-k8s-operator/main/deploy/helm/spark-k8s-operator/crds/crds.yaml
kubectl replace -f https://raw.githubusercontent.com/stackabletech/superset-operator/main/deploy/helm/superset-operator/crds/crds.yaml
kubectl replace -f https://raw.githubusercontent.com/stackabletech/trino-operator/main/deploy/helm/trino-operator/crds/crds.yaml
kubectl replace -f https://raw.githubusercontent.com/stackabletech/zookeeper-operator/main/deploy/helm/zookeeper-operator/crds/crds.yaml

# install dev version of operators
stackablectl operator install commons listener secret hive kafka nifi opa spark-k8s superset trino zookeeper

# deployments and statefulsets rolled out

# edit 'productVersion' for products
kubectl patch opaclusters/opa --type='json' -p='[{"op": "replace", "path": "/spec/image/productVersion", "value":"0.66.0"}]' # changed
kubectl patch zookeeperclusters/zookeeper --type='json' -p='[{"op": "replace", "path": "/spec/image/productVersion", "value":"3.9.2"}]'
kubectl patch hiveclusters/hive --type='json' -p='[{"op": "replace", "path": "/spec/image/productVersion", "value":"3.1.3"}]'
kubectl patch hiveclusters/hive-iceberg --type='json' -p='[{"op": "replace", "path": "/spec/image/productVersion", "value":"3.1.3"}]'
kubectl patch kafkaclusters/kafka --type='json' -p='[{"op": "replace", "path": "/spec/image/productVersion", "value":"3.7.1"}]'
kubectl patch nificlusters/nifi --type='json' -p='[{"op": "replace", "path": "/spec/image/productVersion", "value":"1.27.0"}]'
kubectl patch trinoclusters/trino --type='json' -p='[{"op": "replace", "path": "/spec/image/productVersion", "value":"451"}]' # changed
kubectl patch supersetclusters/superset --type='json' -p='[{"op": "replace", "path": "/spec/image/productVersion", "value":"4.0.2"}]' # changed
# kubectl patch sparkapplications/spark-ingest-into-lakehouse --type='json' -p='[{"op": "replace", "path": "/spec/image/productVersion", "value":"3.5.1"}]' # Was already done

Had scheduling issues, and scaled up the node pool to 12.

Then I ran the demo based on the instructions in the nightly docs (in case there were any updates since stable).

Outcome

Node pool

Pods with anti-affinity constraints that come up last (e.g. those with bigger images to pull) do not get scheduled because there are no nodes left. In this case, Trino has 2 pods that couldn't be placed (hence two extra nodes).

Outcome: Update the docs to require 12 nodes. We probably need to talk about prioritisation and preemption, as IMO, the services with anti-affinity constraints should have a higher priority than those without. Done in 914cedd

secret-operator

Need to copy the secret-provisioner-tls-ca from the default namespace to stackable-operators.

Outcome: This needs documenting in the release notes. Unless stackabletech/secret-operator#453 is completed first.

postgres

Got some warnings:

# Postgres gave the following warnings:
# WARNING: There are "resources" sections in the chart not set. Using "resourcesPreset" is not recommended for production. For production installations, please set the following values according to your workload needs:
#     - primary.resources
#     - readReplicas.resources

Outcome: Do nothing, but perhaps look at it later in more detail. See #62

minio

The text "You can see that Trino has placed a single file into the selected folder containing all the house sales of that particular year." doesn't match the image which shows 4 files.

Outcome: Update the docs to avoid mentioning how many files to expect. Done in 52eb86d

spark

No visualization information is available for the streaming job run, but that seems to be only because the jobs completed. The active job shows a visualisation, as does the completed water_level job.

Images: Job list, Missing visualization

Outcome: This doesn't block the release (the upgrade worked), but the demo does need looking at again. Fix it after the release. See #62

trino

The link in the tpch text ("TPCH connector providing a set of schemas to support the TPC Benchmark™ DS") points to TPCDS.

Outcome: Update the link to point to TPCH. Done in 9b5df59

superset

There is a table preview bug. It is documented, but I only saw the note later.

Images: Error, Documented note about the bug

Outcome: Upgrade the NOTE to a WARNING, and move it before the image. Done in 904148d

@maltesander
Member

maltesander commented Jul 19, 2024

🟢 end-to-end-security

Install 24.3 demo:

stackablectl demo in end-to-end-security

Bump vector:

helm upgrade vector-aggregator vector/vector --version 0.34.0

Uninstall operators:

stackablectl release uninstall 24.3

CRD changes:

kubectl replace -f https://raw.githubusercontent.com/stackabletech/commons-operator/main/deploy/helm/commons-operator/crds/crds.yaml
kubectl replace -f https://raw.githubusercontent.com/stackabletech/listener-operator/main/deploy/helm/listener-operator/crds/crds.yaml
kubectl replace -f https://raw.githubusercontent.com/stackabletech/secret-operator/main/deploy/helm/secret-operator/crds/crds.yaml
kubectl replace -f https://raw.githubusercontent.com/stackabletech/hdfs-operator/main/deploy/helm/hdfs-operator/crds/crds.yaml
kubectl replace -f https://raw.githubusercontent.com/stackabletech/hive-operator/main/deploy/helm/hive-operator/crds/crds.yaml
kubectl replace -f https://raw.githubusercontent.com/stackabletech/opa-operator/main/deploy/helm/opa-operator/crds/crds.yaml
kubectl replace -f https://raw.githubusercontent.com/stackabletech/trino-operator/main/deploy/helm/trino-operator/crds/crds.yaml
kubectl replace -f https://raw.githubusercontent.com/stackabletech/superset-operator/main/deploy/helm/superset-operator/crds/crds.yaml
kubectl replace -f https://raw.githubusercontent.com/stackabletech/zookeeper-operator/main/deploy/helm/zookeeper-operator/crds/crds.yaml

Breaking secret changes:

# copy the secret `secret-provisioner-tls-ca` to the operator's namespace
kubectl get secrets secret-provisioner-tls-ca --output=yaml | \
    sed 's/namespace: .*/namespace: stackable-operators/' | \
    kubectl create --filename=-

Then bump the demo (product versions adapted in demos-v2 and stacks-v2.yaml, release switched to 0.0.0-dev):

stackablectl demo in end-to-end-security -d demos/demos-v2.yaml -s stacks/stacks-v2.yaml -r ../release/releases.yaml

Delete "create-tables-in-trino" job due to:

Caused by these errors (recent errors listed first):
 1: failed to install demo "end-to-end-security"
 2: failed to install stack manifests
 3: failed to deploy manifests using the kube client
 4: failed to patch/create Kubernetes object
 5: ApiError
 6: Job.batch "create-tables-in-trino" is invalid

Everything comes up again with bumped versions:
Trino: 442 -> 451
Superset: 3.1.0 -> 3.1.3
OPA: 0.61.0 -> 0.66.0
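The delete-and-reinstall step for the "create-tables-in-trino" job could be scripted roughly as follows. This is a sketch, assuming the install failed because an existing Job cannot be patched in place (the error output above does not say which field was rejected); `run` and `DRY_RUN` are hypothetical helpers, and by default the commands are only printed.

```shell
# Print (DRY_RUN=1, the default) or execute the given command.
# DRY_RUN is a hypothetical switch introduced for this sketch.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Remove the existing Job so the re-install can create it again.
run kubectl delete job create-tables-in-trino --ignore-not-found

# Re-run the demo install with the locally adapted files (paths as used above).
run stackablectl demo in end-to-end-security -d demos/demos-v2.yaml -s stacks/stacks-v2.yaml -r ../release/releases.yaml
```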

@razvan
Member

razvan commented Jul 19, 2024

🟢 trino-taxi-data

# install demo
stackablectl demo install trino-taxi-data

# uninstall operators
stackablectl release uninstall 24.3

# update crds
kubectl replace -f https://raw.githubusercontent.com/stackabletech/commons-operator/main/deploy/helm/commons-operator/crds/crds.yaml
kubectl replace -f https://raw.githubusercontent.com/stackabletech/listener-operator/main/deploy/helm/listener-operator/crds/crds.yaml
kubectl replace -f https://raw.githubusercontent.com/stackabletech/secret-operator/main/deploy/helm/secret-operator/crds/crds.yaml
kubectl replace -f https://raw.githubusercontent.com/stackabletech/hive-operator/main/deploy/helm/hive-operator/crds/crds.yaml
kubectl replace -f https://raw.githubusercontent.com/stackabletech/trino-operator/main/deploy/helm/trino-operator/crds/crds.yaml
kubectl replace -f https://raw.githubusercontent.com/stackabletech/superset-operator/main/deploy/helm/superset-operator/crds/crds.yaml
kubectl replace -f https://raw.githubusercontent.com/stackabletech/opa-operator/main/deploy/helm/opa-operator/crds/crds.yaml


# install dev operators
stackablectl operator install commons secret listener hive trino superset opa

# edit 'productVersion' for products
kubectl patch opaclusters/opa --type='json' -p='[{"op": "replace", "path": "/spec/image/productVersion", "value":"0.66.0"}]'
kubectl patch trinoclusters/trino --type='json' -p='[{"op": "replace", "path": "/spec/image/productVersion", "value":"451"}]'
kubectl patch supersetclusters/superset --type='json' -p='[{"op": "replace", "path": "/spec/image/productVersion", "value":"4.0.2"}]'


@NickLarsenNZ
Member

This issue is now just waiting on the demos PR to be merged. #60

@maltesander
Member

All done!
