Commit b16febc

Doc updates for JDBC connections.
1 parent cebd6cc commit b16febc

1 file changed, +53 -0 lines changed

README.md

Lines changed: 53 additions & 0 deletions
@@ -68,6 +68,7 @@ The output reports are written in [Markdown](https://www.markdownguide.org/). I
  * [Scenario #1](#scenario-%231)
- [Permissions](#permissions-1)
- [Configuration](#configuration)
  * [JDBC Drivers and Configuration](#jdbc-drivers-and-configuration)
  * [Secure Passwords in Configuration](#secure-passwords-in-configuration)
- [Tips for Running `hms-mirror`](#tips-for-running-hms-miror)
  * [Run in `screen` or `tmux`](#run-in-screen-or-tmux)
@@ -901,6 +902,58 @@ You'll need JDBC driver jar files that are **specific* to the clusters you'll in
See the [running](#running-hms-mirror) section for examples of running `hms-mirror` for various environment types and connections.

### JDBC Drivers and Configuration

`hms-mirror` requires JDBC drivers to connect to the various endpoints it needs to perform its tasks. The `LEFT` and `RIGHT` cluster endpoints for HiveServer2 require the standalone JDBC driver that is specific to that cluster's Hive version.

`hms-mirror` supports the Apache and packaged Hive **standalone** drivers that ship with your distribution. For CDP, we also support the Cloudera JDBC driver found and maintained on the [Cloudera Hive JDBC Downloads Page](https://www.cloudera.com/downloads/connectors/hive/jdbc). Note that the URL configurations for the Apache and Cloudera JDBC drivers are different.

Starting with the Apache standalone driver shipped with the CDP 7.1.8 cumulative hotfix parcels, you will need to include additional jars in the `jarFile` configuration, due to some packaging adjustments.

For example: `jarFile: "<cdp_parcel_jars>/hive-jdbc-3.1.3000.7.1.8.28-1-standalone.jar:<cdp_parcel_jars>/log4j-1.2-api-2.18.0.jar:<cdp_parcel_jars>/log4j-api-2.18.0.jar:<cdp_parcel_jars>/log4j-core-2.18.0.jar"`

NOTE: The jar file with the Hive driver MUST be the first in the list of jar files.

The Cloudera JDBC driver shouldn't require additional jars.
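
The ordering rule for `jarFile` can be illustrated with a small helper. This is only a sketch, not part of `hms-mirror`: the function name `build_jar_file` is hypothetical, and `<cdp_parcel_jars>` is the same placeholder used in the example above, so substitute the real directory on your host.

```python
from typing import Iterable


def build_jar_file(driver_jar: str, extra_jars: Iterable[str] = ()) -> str:
    """Join jar paths with ':' so that the Hive driver jar is always first."""
    # Drop the driver from the extras in case it was listed twice.
    extras = [jar for jar in extra_jars if jar != driver_jar]
    return ":".join([driver_jar, *extras])


# Placeholder directory taken from the README example; not a real path.
jars_dir = "<cdp_parcel_jars>"

jar_file = build_jar_file(
    f"{jars_dir}/hive-jdbc-3.1.3000.7.1.8.28-1-standalone.jar",
    [
        f"{jars_dir}/log4j-1.2-api-2.18.0.jar",
        f"{jars_dir}/log4j-api-2.18.0.jar",
        f"{jars_dir}/log4j-core-2.18.0.jar",
    ],
)

# Print the line in the form used in the configuration file.
print(f'jarFile: "{jar_file}"')
```

Passing the driver as its own argument, rather than as one entry in a list, makes it hard to put the log4j jars ahead of it by accident.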

##### Example URLs

###### **CDP Hive via Knox Gateway**

Doesn't require Kerberos. Knox uses SSL, so if your certificates are self-signed you may need to make adjustments, such as supplying a trust store or allowing self-signed certificates as in the URLs below.

- Apache Hive and CDP Packaged Apache Hive JDBC Driver
  > `jdbc:hive2://s03.streever.local:8443/;ssl=true;transportMode=http;httpPath=gateway/cdp-proxy-api/hive;sslTrustStore=/Users/dstreev/bin/certs/gateway-client-trust.jks;trustStorePassword=changeit`

- Cloudera JDBC Driver
  > `jdbc:hive2://s03.streever.local:8443;transportMode=http;AuthMech=3;httpPath=gateway/cdp-proxy-api/hive;SSL=1;AllowSelfSignedCerts=1`
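
To see how the two Knox URLs above differ, the following sketch splits each one into its host and its `key=value` settings and compares the key names. The helper `split_hive_url` is hypothetical and deliberately simple: it assumes everything after the first `;` is a `key=value` pair, ignores any database name, and does not handle the `?` and `#` sections that Apache Hive URLs may also carry. Key names are compared literally, so `ssl` and `SSL` count as different keys.

```python
from typing import Dict, Tuple

APACHE_URL = (
    "jdbc:hive2://s03.streever.local:8443/;ssl=true;transportMode=http;"
    "httpPath=gateway/cdp-proxy-api/hive;"
    "sslTrustStore=/Users/dstreev/bin/certs/gateway-client-trust.jks;"
    "trustStorePassword=changeit"
)
CLOUDERA_URL = (
    "jdbc:hive2://s03.streever.local:8443;transportMode=http;AuthMech=3;"
    "httpPath=gateway/cdp-proxy-api/hive;SSL=1;AllowSelfSignedCerts=1"
)


def split_hive_url(url: str) -> Tuple[str, Dict[str, str]]:
    """Return (host:port, settings) for a jdbc:hive2 URL."""
    prefix = "jdbc:hive2://"
    if not url.startswith(prefix):
        raise ValueError(f"not a jdbc:hive2 URL: {url}")
    head, *pairs = url[len(prefix):].split(";")
    # Drop a trailing '/' or '/<database>' from the authority part.
    host_port = head.split("/", 1)[0]
    settings = {}
    for pair in pairs:
        if pair:
            key, _, value = pair.partition("=")
            settings[key] = value
    return host_port, settings


apache_host, apache = split_hive_url(APACHE_URL)
cloudera_host, cloudera = split_hive_url(CLOUDERA_URL)

shared = sorted(set(apache) & set(cloudera))
only_apache = sorted(set(apache) - set(cloudera))
only_cloudera = sorted(set(cloudera) - set(apache))

print("same endpoint:", apache_host == cloudera_host)
print("shared keys:", shared)
print("Apache-only keys:", only_apache)
print("Cloudera-only keys:", only_cloudera)
```

Both URLs point at the same Knox endpoint and share `transportMode` and `httpPath`; the SSL and authentication settings are where the two drivers diverge, which is why a URL written for one driver should not be reused for the other.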

###### **CDP Hive direct with Kerberos**

When connecting via Kerberos, place the JDBC driver for Hive in the `$HOME/.hms-mirror/aux_libs` directory. This ensures it is loaded on the classpath correctly to support the Kerberos connection. If you're NOT using a Kerberos connection, the libraries should be referenced in the `jarFile` parameter and NOT be in the `aux_libs` directory.

If you are experimenting with different connection types, ensure the jar file is REMOVED from `aux_libs` when trying other configurations.

- Apache JDBC Driver (packaged with CDP)
  > `tbd`
  > NOTE: Our experience with this driver is that it requires the use of `--hadoop-classpath` on the command line to load the needed Kerberos libraries.

- Cloudera JDBC Driver
  > `jdbc:hive2://s04.streever.local:10001;transportMode=http;AuthMech=1;KrbRealm=STREEVER.LOCAL;KrbHostFQDN=s04.streever.local;KrbServiceName=hive;KrbAuthType=2;httpPath=cliservice;SSL=1;AllowSelfSignedCerts=1`
  > NOTE: When using this driver, our experience has been that you do NOT need to use `--hadoop-classpath` as a command-line element with versions 1.6.1.0+.
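
The placement rule at the top of this section (driver in `aux_libs` for Kerberos, driver in `jarFile` and out of `aux_libs` otherwise) is easy to get wrong when switching between connection types. The sketch below is a hypothetical pre-flight check, not something `hms-mirror` provides. The file name patterns in `DRIVER_PATTERNS` are assumptions about how the Apache and Cloudera driver jars are typically named; adjust them to match your jars.

```python
from pathlib import Path
from typing import List, Optional

# Assumed jar name patterns for the Apache and Cloudera Hive drivers.
DRIVER_PATTERNS = ("hive-jdbc-*standalone*.jar", "HiveJDBC*.jar")


def drivers_in_aux_libs(aux_libs: Path) -> List[str]:
    """List Hive driver jars (by assumed name pattern) found in aux_libs."""
    if not aux_libs.is_dir():
        return []
    found = set()
    for pattern in DRIVER_PATTERNS:
        found.update(path.name for path in aux_libs.glob(pattern))
    return sorted(found)


def placement_warnings(
    uses_kerberos: bool, aux_libs: Path, jar_file: Optional[str]
) -> List[str]:
    """Return warnings when the driver location does not match the rule."""
    found = drivers_in_aux_libs(aux_libs)
    warnings = []
    if uses_kerberos and not found:
        warnings.append("Kerberos: no Hive JDBC driver found in aux_libs")
    if not uses_kerberos:
        if found:
            warnings.append(
                "non-Kerberos: remove from aux_libs: " + ", ".join(found)
            )
        if not jar_file:
            warnings.append("non-Kerberos: jarFile is not set")
    return warnings
```

A call such as `placement_warnings(True, Path.home() / ".hms-mirror" / "aux_libs", None)` returns an empty list when a driver jar is in place for a Kerberos connection. Other jars in `aux_libs`, such as a metastore database driver, are ignored because they do not match the assumed patterns.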

###### **HDP2 HS2 with No Auth**

Since CDP is usually kerberized AND `hms-mirror` doesn't support simultaneous connections to two different Kerberos environments, I've set up an HS2 on HDP2 specifically for this effort. NOTE: You need to specify a `username` when connecting to let Hive know who the user is. No password is required.

- Apache Hive Standalone Driver shipped with HDP2.
  > `jdbc:hive2://k02.streever.local:10000`

#### Direct Metastore DB Access for `-epl`

The `LEFT` and `RIGHT` configurations also support 'direct' metastore access to collect detailed partition information when using the `-epl` flag. To support this feature, get the JDBC driver that is appropriate for your metastore backend database(s) and place it in the `$HOME/.hms-mirror/aux_libs` directory.

### Secure Passwords in Configuration

There are two passwords stored in the configuration file mentioned above, one for each JDBC connection, if those rely on a password to connect. By default, the passwords are in clear text in the configuration file. This usually isn't an issue since the file can be protected at the UNIX level from prying eyes. But if you need to protect those passwords, `hms-mirror` supports storing an encrypted version of the password in the configuration.
