trino + hive + minio with postgres in docker compose
This is a personal exercise for me to understand Trino and its architecture, especially with Hive Metastore/ Postgres/ Minio. The idea of the exercise is originated from https://trino.io/blog/2020/10/20/intro-to-hive-connector.html. The only difference is that I changed the DB of Hive Metastore from MariaDB to PostgresDB and tried to implement it with k8s with the help of https://github.com/alexcpn/presto_in_kubernetes.
The implementation with docker-compose is referred to https://github.com/bitsondatadev/trino-getting-started/tree/main/hive/trino-minio with the help of https://github.com/bitsondatadev/hive-metastore/pull/2/commits to build a Hive Metastore image to support PostgresDB.
Step 0 - Go to the docker-compose directory
cd docker-compose
Step 1 - Build a Hive Metastore image with the Dockerfile to support PostgresDB
docker build -t my-hive-metastore .
Step 2 - Implement with docker-compose
docker-compose up -d
Step 3 - Create Bucket in MinIO
- account: minio
- password: minio123
Step 4 - Into the runnung trino container
docker container exec -it docker-compose_trino-coordinator_1 trino
Step 5 - Create schema and table and play around with trino, you can see the trino dashboard from localhost:8080.
CREATE SCHEMA minio.test
WITH (location = 's3a://test/');
CREATE TABLE minio.test.customer
WITH (
format = 'ORC',
external_location = 's3a://test/customer/'
)
AS SELECT * FROM tpch.tiny.customer;
Step 6 - Into the running postgres container
docker exec -it "docker-compose_postgres_1" psql -U admin -d "hive_db"
Step 7 - Run SQL commands on postgresDB to see where the metadata is stored.
SELECT
"DB_ID",
"DB_LOCATION_URI",
"NAME",
"OWNER_NAME",
"OWNER_TYPE",
"CTLG_NAME"
FROM "DBS";
For more metadata detail, kindly check: https://github.com/bitsondatadev/trino-getting-started/tree/main/hive/trino-minio
Step 8 - Close down the running containers
docker-compose down