Help with Cluster optimization for a 150 node Trino 457 cluster running on r6g.16xlarge #24817

soham-dasgupta · 2025-01-28T06:47:30Z

Hi Team, I am looking for some help optimizing our Trino (457) cluster running on emr-7.6.0. Our main use case is to read tables from Glue backed by S3, transform the data, and write it back to S3 using CTAS. I am trying to benchmark the cluster setup below using a CTAS query that joins two tables: one has 61 billion records and the other has 58 billion records.

Here is the coordinator config:

coordinator=true
node-scheduler.include-coordinator=false
discovery.uri=https://ip-xxx-xxx-xxx-xx.ec2.internal:8080
http-server.threads.max=500
discovery-server.enabled=true
sink.max-buffer-size=32MB
query.max-memory=29687808MB
query.max-memory-per-node=213781997159B
query.max-history=1000
query.min-expire-age=30m
query.client.timeout=30m
query.stage-count-warning-threshold=100
query.max-stage-count=400
http-server.http.port=8000
http-server.log.path=/var/log/trino/http-request.log
http-server.log.max-size=67108864B
http-server.log.max-history=5
log.max-size=268435456B
jmx.rmiregistry.port = 9080
jmx.rmiserver.port = 9081
task.max-worker-threads = 192
http-server.https.enabled = True
optimizer.optimize-hash-generation = True
http-server.https.port = 8080
query.max-queued-queries = 5000
strict-mode-restrictions = MANDATORY_PARTITION_PREDICATE,DISALLOW_CROSS_JOIN,LIMITED_SORT
http-server.authentication.type = PASSWORD
internal-communication.shared-secret = 123
query.remote-task.terminate-on-connect-exception = True
optimizer.join-reordering-strategy = AUTOMATIC
sink.max-broadcast-buffer-size = 256MB
exchange.client-threads = 64
http-server.authentication.allow-insecure-over-http = True
exchange.max-buffer-size = 512MB
internal-communication.https.required = True
query.execution-policy = phased
http-server.process-forwarded = True
internal-communication.http2.enabled = False
node-scheduler.max-splits-per-node = 256
task.concurrency = 64
web-ui.enabled = True
task.http-response-threads = 200
optimizer.optimize-metadata-queries = True
protocol.v1.alternate-header-name = Presto
web-ui.preview.enabled = True
strict-mode-enabled = True
query.max-concurrent-queries = 180
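
As a quick sanity check on the two memory limits above, here is a minimal Python sketch that converts both values to bytes and compares them. It assumes Trino data sizes use binary multiples (1MB = 2^20 bytes) and that there are 150 workers (taken from the node count in the title; the real worker count may be one lower if the coordinator is counted). The helper `parse_data_size` is a hypothetical name for this illustration, not part of Trino.

```python
import re

# Assumption: Trino data sizes use binary multiples (1MB = 2**20 bytes).
UNIT_BYTES = {"B": 1, "kB": 2**10, "MB": 2**20, "GB": 2**30, "TB": 2**40}


def parse_data_size(text: str) -> int:
    """Convert a value such as '512MB' or '213781997159B' to bytes."""
    match = re.fullmatch(r"(\d+(?:\.\d+)?)\s*(B|kB|MB|GB|TB)", text.strip())
    if match is None:
        raise ValueError(f"unrecognized data size: {text!r}")
    number, unit = match.groups()
    return round(float(number) * UNIT_BYTES[unit])


# Values copied from the coordinator config above.
query_max_memory = parse_data_size("29687808MB")
query_max_memory_per_node = parse_data_size("213781997159B")

# Assumption: 150 workers, taken from the node count in the title.
worker_count = 150

per_node_gib = query_max_memory_per_node / 2**30
nodes_covered = query_max_memory / query_max_memory_per_node
sum_of_per_node_limits = worker_count * query_max_memory_per_node

print(f"per-node limit: {per_node_gib:.1f} GiB")
print(f"cluster-wide limit = {nodes_covered:.1f} x per-node limit")
print(f"sum of per-node limits exceeds cluster-wide limit: "
      f"{sum_of_per_node_limits > query_max_memory}")
```

Under these assumptions the per-node limit works out to about 199.1 GiB, and the cluster-wide limit equals roughly 145.6 times the per-node limit, so a single query would hit query.max-memory slightly before it could use the full per-node allowance on all 150 workers.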

Here is the catalog configuration:

connector.name=hive
hive.metastore=glue
hive.metastore.glue.region=us-east-1
hive.metastore.glue.iam-role=arn:aws:iam::xx:role/xx
hive.metastore.glue.catalogid=xx
hive.metastore.glue.max-connections=100
hive.metastore.glue.max-error-retries=10
hive.non-managed-table-writes-enabled=true
hive.non-managed-table-creates-enabled=true
hive.storage-format=PARQUET
hive.max-split-size=512MB
hive.max-initial-split-size=256MB
hive.parquet.use-column-names=true
hive.orc.use-column-names=true
hive.max-partitions-per-writers=1000
hive.collect-column-statistics-on-write=true
hive.metastore-cache-ttl=4h
hive.metastore-refresh-interval=1h
hive.compression-codec=SNAPPY
fs.native-s3.enabled=true
s3.max-error-retries=50
s3.max-connections=2000
s3.streaming.part-size=128MB
s3.socket-read-timeout=30m
s3.socket-connect-timeout=30m
s3.connection-ttl=0s
s3.sse.type=S3
s3.iam-role=arn:aws:iam::x:role/xx
s3.canned-acl=BUCKET_OWNER_FULL_CONTROL
parquet.writer.block-size=128MB
parquet.writer.page-size=1MB
parquet.writer.batch-size=10000
hive.iceberg-catalog-name=ib

I am trying to benchmark the cluster against this fairly complex query to find the levers I can pull to optimize it.

Link to query https://pastecode.io/s/xfeai33m
Link to query plan https://pastecode.io/s/rj9nx0hh

Count of rows
stg_dim_ad_group 61153245481
fact_sa_ad_group_dly 58372216116
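
To put these row counts in per-worker terms, here is a rough Python sketch. It assumes 150 workers (from the title) and that the join runs as a partitioned (hash-distributed) join with perfectly even key distribution; neither is confirmed by the post, and skewed join keys would make some workers handle far more rows than this average.

```python
# Row counts copied from the post.
row_counts = {
    "stg_dim_ad_group": 61_153_245_481,
    "fact_sa_ad_group_dly": 58_372_216_116,
}

# Assumption: 150 workers and a perfectly even hash distribution of join keys.
worker_count = 150

total_rows = sum(row_counts.values())
rows_per_worker = total_rows / worker_count

for table, rows in row_counts.items():
    print(f"{table}: {rows / worker_count:,.0f} rows per worker")
print(f"both join inputs: {rows_per_worker:,.0f} rows per worker")
```

With these inputs each worker would see on the order of 800 million input rows across the two sides of the join, which gives a baseline for judging per-node memory and exchange settings.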
