Add descriptions #511

Merged
merged 1 commit into from
Sep 18, 2024
6 changes: 3 additions & 3 deletions docs/modules/hive/pages/getting_started/first_steps.adoc
Original file line number Diff line number Diff line change
@@ -1,8 +1,8 @@
= First steps
:description: Deploy and verify a Hive metastore cluster with PostgreSQL and MinIO. Follow our setup guide and ensure all pods are ready for operation.

After going through the xref:getting_started/installation.adoc[] section and having installed all the operators, you
will now deploy a Hive metastore cluster and it's dependencies. Afterwards you can
<<_verify_that_it_works, verify that it works>>.
After going through the xref:getting_started/installation.adoc[] section and having installed all the operators, you will now deploy a Hive metastore cluster and its dependencies.

Afterwards you can <<_verify_that_it_works, verify that it works>>.

== Setup

4 changes: 3 additions & 1 deletion docs/modules/hive/pages/getting_started/index.adoc
@@ -1,6 +1,8 @@
= Getting started
:description: Learn to set up Apache Hive with the Stackable Operator. Includes installation, dependencies, and creating a Hive metastore on Kubernetes.

This guide will get you started with Apache Hive using the Stackable Operator. It will guide you through the installation of the operator, its dependencies and setting up your first Hive metastore instance.
This guide will get you started with Apache Hive using the Stackable Operator.
It will guide you through the installation of the operator, its dependencies and setting up your first Hive metastore instance.

== Prerequisites

27 changes: 13 additions & 14 deletions docs/modules/hive/pages/getting_started/installation.adoc
@@ -1,16 +1,16 @@
= Installation
:description: Install Stackable Operator for Apache Hive with MinIO and PostgreSQL using stackablectl or Helm. Follow our guide for easy setup and configuration.

On this page you will install the Stackable Operator for Apache Hive and all required dependencies. For the installation
of the dependencies and operators you can use Helm or `stackablectl`.
On this page you will install the Stackable Operator for Apache Hive and all required dependencies.
For the installation of the dependencies and operators you can use Helm or `stackablectl`.

The `stackablectl` command line tool is the recommended way to interact with operators and dependencies. Follow the
xref:management:stackablectl:installation.adoc[installation steps] for your platform if you choose to work with
`stackablectl`.
The `stackablectl` command line tool is the recommended way to interact with operators and dependencies.
Follow the xref:management:stackablectl:installation.adoc[installation steps] for your platform if you choose to work with `stackablectl`.

== Dependencies

First you need to install MinIO and PostgreSQL instances for the Hive metastore. PostgreSQL is required as a database
for Hive's metadata, and MinIO will be used as a data store, which the Hive metastore also needs access to.
First you need to install MinIO and PostgreSQL instances for the Hive metastore.
PostgreSQL is required as a database for Hive's metadata, and MinIO will be used as a data store, which the Hive metastore also needs access to.

There are two ways to install the dependencies:

@@ -21,9 +21,8 @@ WARNING: The dependency installations in this guide are only intended for testin

=== stackablectl

`stackablectl` was designed to install Stackable components, but its xref:management:stackablectl:commands/stack.adoc[Stacks]
feature can also be used to install arbitrary Helm Charts. You can install MinIO and PostgreSQL using the Stacks feature
as follows, but a simpler method via Helm is shown <<Helm, below>>.
`stackablectl` was designed to install Stackable components, but its xref:management:stackablectl:commands/stack.adoc[Stacks] feature can also be used to install arbitrary Helm Charts.
You can install MinIO and PostgreSQL using the Stacks feature as follows, but a simpler method via Helm is shown <<Helm, below>>.

[source,bash]
----
@@ -67,8 +66,8 @@ Now call `stackablectl` and reference those two files:
include::example$getting_started/getting_started.sh[tag=stackablectl-install-minio-postgres-stack]
----

This will install MinIO and PostgreSQL as defined in the Stacks, as well as the Operators. You can now skip the
<<Stackable Operators>> step that follows next.
This will install MinIO and PostgreSQL as defined in the Stacks, as well as the Operators.
You can now skip the <<Stackable Operators>> step that follows next.

TIP: Consult the xref:management:stackablectl:quickstart.adoc[Quickstart] to learn more about how to use `stackablectl`.

@@ -133,8 +132,8 @@ Then install the Stackable operators:
include::example$getting_started/getting_started.sh[tag=helm-install-operators]
----

Helm will deploy the operators in a Kubernetes Deployment and apply the CRDs for the Apache Hive service (as well as the
CRDs for the required operators). You are now ready to deploy the Apache Hive metastore in Kubernetes.
Helm will deploy the operators in a Kubernetes Deployment and apply the CRDs for the Apache Hive service (as well as the CRDs for the required operators).
You are now ready to deploy the Apache Hive metastore in Kubernetes.

== What's next

2 changes: 1 addition & 1 deletion docs/modules/hive/pages/index.adoc
@@ -1,5 +1,5 @@
= Stackable Operator for Apache Hive
:description: The Stackable Operator for Apache Hive is a Kubernetes operator that can manage Apache Hive metastores. Learn about its features, resources, dependencies and demos, and see the list of supported Hive versions.
:description: Manage Apache Hive metastores on Kubernetes with the Stackable Operator. Integrates with Trino and Spark.
:keywords: Stackable Operator, Hadoop, Apache Hive, Kubernetes, k8s, operator, engineer, big data, metadata, storage, query
:hive: https://hive.apache.org
:github: https://github.com/stackabletech/hive-operator/
4 changes: 3 additions & 1 deletion docs/modules/hive/pages/required-external-components.adoc
@@ -1,6 +1,8 @@
= Required external components
:description: Hive Metastore requires a SQL database. Supported options include MySQL, Postgres, Oracle, and MS SQL Server. Stackable Hive supports PostgreSQL by default.

The Hive Metastore requires a backend SQL database. Supported databases and versions are:
The Hive Metastore requires a backend SQL database.
Supported databases and versions are:

* MySQL 5.6.17 and above
* Postgres 9.1.13 and above
@@ -1,4 +1,5 @@
= Configuration & environment overrides
:description: Override Hive config properties and environment variables at role or role group levels. Customize hive-site.xml, security.properties, and environment vars.

The cluster definition also supports overriding configuration properties and environment variables, either per role or per role group, where the more specific override (role group) has precedence over the less specific one (role).

@@ -8,8 +9,8 @@ IMPORTANT: Overriding certain properties, which are set by the operator (such as

For a role or role group, at the same level of `config`, you can specify: `configOverrides` for the following files:

- `hive-site.xml`
- `security.properties`
* `hive-site.xml`
* `security.properties`

For example, if you want to set the `datanucleus.connectionPool.maxPoolSize` for the metastore to 20, adapt the `metastore` section of the cluster resource like so:
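
A sketch of what such an override could look like (the role-group name `default` is an assumption; `configOverrides` can equally be set at role level):

[source,yaml]
----
metastore:
  roleGroups:
    default:
      configOverrides:
        hive-site.xml:
          datanucleus.connectionPool.maxPoolSize: "20"
----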

1 change: 1 addition & 0 deletions docs/modules/hive/pages/usage-guide/data-storage.adoc
@@ -1,4 +1,5 @@
= Data storage backends
:description: Hive supports metadata storage on S3 and HDFS. Configure S3 with S3Connection and HDFS with configMap in clusterConfig.

Hive does not store data, only metadata. It can store metadata about data stored in various places. The Stackable Operator currently supports S3 and HDFS.
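
As a rough sketch, an S3 backend is referenced from `clusterConfig` (the S3Connection name `minio` is an assumption):

[source,yaml]
----
spec:
  clusterConfig:
    s3:
      reference: minio  # name of an existing S3Connection resource
----

An HDFS backend is wired up similarly, by pointing `clusterConfig.hdfs.configMap` at the HDFS discovery ConfigMap.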

3 changes: 2 additions & 1 deletion docs/modules/hive/pages/usage-guide/database-driver.adoc
@@ -1,7 +1,8 @@
= Database drivers
:description: Learn to configure Apache Hive with MySQL using Helm, PVCs, and custom images. Includes steps for driver setup and Hive cluster creation.

The Stackable product images for Apache Hive come with built-in support for using PostgreSQL as the metastore database.
The MySQL driver is not shipped in our images due to licensing issues.
The MySQL driver is not shipped in Stackable images due to licensing issues.
To use another supported database it is necessary to make the relevant drivers available to Hive: this tutorial shows how this is done for MySQL.

== Install the MySQL helm chart
2 changes: 1 addition & 1 deletion docs/modules/hive/pages/usage-guide/derby-example.adoc
@@ -1,5 +1,5 @@

= Derby example
:description: Deploy a single-node Apache Hive Metastore with Derby or PostgreSQL. Includes setup for S3 integration and tips for database configuration.

Note that the version you need to specify is not only the version of Apache Hive that you want to roll out; it must also be amended with a Stackable version, as shown.
This Stackable version is the version of the underlying container image which is used to execute the processes.
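
A sketch of such an image specification (both version numbers are placeholders):

[source,yaml]
----
spec:
  image:
    productVersion: 3.1.3     # the Apache Hive version to roll out
    stackableVersion: 24.7.0  # the underlying Stackable container image version
----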
4 changes: 3 additions & 1 deletion docs/modules/hive/pages/usage-guide/index.adoc
@@ -1,4 +1,6 @@
= Usage guide
:page-aliases: usage.adoc

This Section will help you to use and configure the Stackable Operator for Apache Hive in various ways. You should already be familiar with how to set up a basic instance. Follow the xref:getting_started/index.adoc[] guide to learn how to set up a basic instance with all the required dependencies.
This section will help you to use and configure the Stackable Operator for Apache Hive in various ways.
You should already be familiar with how to set up a basic instance.
Follow the xref:getting_started/index.adoc[] guide to learn how to set up a basic instance with all the required dependencies.
3 changes: 2 additions & 1 deletion docs/modules/hive/pages/usage-guide/listenerclass.adoc
@@ -1,6 +1,7 @@
= Service exposition with ListenerClasses

Apache Hive offers an API. The Operator deploys a service called `<name>` (where `<name>` is the name of the HiveCluster) through which Hive can be reached.
Apache Hive offers an API.
The Operator deploys a service called `<name>` (where `<name>` is the name of the HiveCluster) through which Hive can be reached.

This service can have three different types: `cluster-internal`, `external-unstable` and `external-stable`. Read more about the types in the xref:concepts:service-exposition.adoc[service exposition] documentation at platform level.
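
A sketch of selecting one of these types on the HiveCluster (the field placement under `clusterConfig` is an assumption to be checked against the HiveCluster CRD):

[source,yaml]
----
spec:
  clusterConfig:
    listenerClass: external-unstable
----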

7 changes: 3 additions & 4 deletions docs/modules/hive/pages/usage-guide/logging.adoc
@@ -1,7 +1,7 @@
= Log aggregation
:description: The logs can be forwarded to a Vector log aggregator by providing a discovery ConfigMap for the aggregator and by enabling the log agent.

The logs can be forwarded to a Vector log aggregator by providing a discovery
ConfigMap for the aggregator and by enabling the log agent:
The logs can be forwarded to a Vector log aggregator by providing a discovery ConfigMap for the aggregator and by enabling the log agent:

[source,yaml]
----
@@ -14,5 +14,4 @@ spec:
enableVectorAgent: true
----

Further information on how to configure logging, can be found in
xref:concepts:logging.adoc[].
Further information on how to configure logging can be found in xref:concepts:logging.adoc[].
5 changes: 3 additions & 2 deletions docs/modules/hive/pages/usage-guide/monitoring.adoc
@@ -1,4 +1,5 @@
= Monitoring
:description: The managed Hive instances are automatically configured to export Prometheus metrics.

The managed Hive instances are automatically configured to export Prometheus metrics. See
xref:operators:monitoring.adoc[] for more details.
The managed Hive instances are automatically configured to export Prometheus metrics.
See xref:operators:monitoring.adoc[] for more details.
3 changes: 2 additions & 1 deletion docs/modules/hive/pages/usage-guide/resources.adoc
@@ -1,4 +1,5 @@
= Resource requests
:description: Set CPU and memory requests for Hive metastore in Kubernetes. Default values and customization options are provided for optimal resource management.

include::home:concepts:stackable_resource_requests.adoc[]

@@ -27,7 +28,7 @@ metastore:
memory: "512Mi"
----

The operator may configure an additional container for log aggregation. This is done when log aggregation is configured as described in xref:concepts:logging.adoc[]. The resources for this container cannot be configured using the mechanism described above. Use xref:nightly@home:concepts:overrides.adoc#_pod_overrides[podOverrides] for this purpose.
The operator may configure an additional container for log aggregation. This is done when log aggregation is configured as described in xref:concepts:logging.adoc[]. The resources for this container cannot be configured using the mechanism described above. Use xref:home:concepts:overrides.adoc#_pod_overrides[podOverrides] for this purpose.

You can configure your own resource requests and limits by following the example above.

3 changes: 2 additions & 1 deletion docs/modules/hive/pages/usage-guide/security.adoc
@@ -1,4 +1,5 @@
= Security
:description: Secure Apache Hive with Kerberos authentication in Kubernetes. Configure Kerberos server, SecretClass, and access Hive securely with provided guides.

== Authentication
Currently, the only supported authentication mechanism is Kerberos, which is disabled by default.
@@ -17,7 +18,7 @@ The next step is to configure your HdfsCluster to use the newly created SecretCl
Please make sure to use the SecretClass named `kerberos`. It is also necessary to configure 2 additional things in HDFS:

* Define group mappings for users with `hadoop.user.group.static.mapping.overrides`
* Tell HDFS that Hive is allowed to impersonate other users, i.e. Hive does not need any _direct_ access permissions for itself, but should be able to impersonate Hive users when accessing HDFS. This can be done by e.g. setting `hadoop.proxyuser.hive.users=*` and `hadoop.proxyuser.hive.hosts=*` to allow the user `hive`´to impersonate all other users.
* Tell HDFS that Hive is allowed to impersonate other users, i.e. Hive does not need any _direct_ access permissions for itself, but should be able to impersonate Hive users when accessing HDFS. This can be done by e.g. setting `hadoop.proxyuser.hive.users=*` and `hadoop.proxyuser.hive.hosts=*` to allow the user `hive` to impersonate all other users.
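
A sketch of how these two properties could be set on the HdfsCluster via config overrides (placing them in `core-site.xml` on the name nodes is an assumption; adapt to your HDFS stacklet definition):

[source,yaml]
----
nameNodes:
  configOverrides:
    core-site.xml:
      hadoop.proxyuser.hive.users: "*"
      hadoop.proxyuser.hive.hosts: "*"
----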

An example of the above can be found in this https://github.com/stackabletech/hive-operator/blob/main/tests/templates/kuttl/kerberos-hdfs/30-install-hdfs.yaml.j2[integration test].
