Backend
VL (Velox)
Bug description
Hi team,
I am experimenting with off-heap memory to understand its effect on the TPC-H benchmark with the Gluten + Velox backend. However, I noticed that increasing off-heap memory does not consistently speed up all 22 TPC-H queries: the speedup is not uniform, and some queries (Q3, Q11, Q18, Q19, and Q20) either regress or show no noticeable improvement.
Can you help me understand how off-heap memory affects query performance, and how I might improve performance for these specific queries?
Experiments Conducted:
I tested the following off-heap and on-heap memory combinations (Executors: 4x4 means 4 executor instances with 4 cores each, per spark.executor.instances=4 and spark.executor.cores=4 below):
Off-heap memory: 6GB, Executor memory: 30GB, Executors: 4x4
Off-heap memory: 12GB, Executor memory: 30GB, Executors: 4x4
Off-heap memory: 20GB, Executor memory: 30GB, Executors: 4x4
Test Environment:
Instance: ARM-based AWS instance (m7g.4xlarge)
VCPUs: 16
Memory: 64GB
Spark Version: 3.5.2
Data Size: Scale Factor SF=100
Any insights or recommendations on optimizing these queries with off-heap memory would be greatly appreciated.
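For context, a back-of-the-envelope memory budget may explain part of the non-uniform behavior. The sketch below is a rough check, not a diagnosis: it assumes all four executors are co-located on a single 64GB m7g.4xlarge node, and it uses the per-executor figures from the spark-shell configs below (spark.executor.memory=7500m, spark.executor.memoryOverhead=2g) rather than the 30GB figure above, which appears to be the 4-executor total.

```python
# Rough per-executor and per-node memory footprint for each off-heap
# setting. Assumption: 4 executors on one 64 GB node; heap and overhead
# taken from the posted configs (7500m heap, 2g overhead).
GB = 1024  # work in MiB to keep the arithmetic exact

executor_heap_mb = 7500
overhead_mb = 2 * GB
node_mem_mb = 64 * GB
executors_per_node = 4

for offheap_gb in (6, 12, 20):
    per_exec = executor_heap_mb + overhead_mb + offheap_gb * GB
    per_node = per_exec * executors_per_node
    fits = "fits" if per_node <= node_mem_mb else "OVERCOMMITS"
    print(f"off-heap {offheap_gb:>2} GB: per-executor {per_exec / GB:.1f} GB, "
          f"node total {per_node / GB:.1f} GB of {node_mem_mb // GB} GB ({fits})")
```

Under these assumptions, only the 6GB setting keeps the four executors within a single node's 64GB; the 12GB and 20GB settings would overcommit a single node, which could itself account for regressions on some queries. If the executors are actually spread across multiple nodes, this check does not apply.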
Spark version
Spark-3.5.x
Spark configurations
cat tpch_parquet.scala | ${SPARK_HOME}/bin/spark-shell \
  --master spark://172.32.5.244:7077 --deploy-mode client \
  --conf spark.plugins=org.apache.gluten.GlutenPlugin \
  --conf spark.driver.extraClassPath=${GLUTEN_JAR} \
  --conf spark.executor.extraClassPath=${GLUTEN_JAR} \
  --conf spark.memory.offHeap.enabled=true \
  --conf spark.memory.offHeap.size= \
  --conf spark.gluten.sql.columnar.forceShuffledHashJoin=true \
  --conf spark.driver.memory=4G \
  --conf spark.executor.instances=4 \
  --conf spark.executor.memory=7500m \
  --conf spark.executor.cores=4 \
  --conf spark.sql.shuffle.partitions=32 \
  --conf spark.executor.memoryOverhead=2g \
  --conf spark.driver.maxResultSize=2g \
  --conf spark.shuffle.manager=org.apache.spark.shuffle.sort.ColumnarShuffleManager \
  --conf spark.driver.extraJavaOptions="--illegal-access=permit -Dio.netty.tryReflectionSetAccessible=true --add-opens java.base/java.lang=ALL-UNNAMED --add-opens java.base/java.util=ALL-UNNAMED" \
  --conf spark.executor.extraJavaOptions="--illegal-access=permit -Dio.netty.tryReflectionSetAccessible=true --add-opens java.base/java.lang=ALL-UNNAMED --add-opens java.base/java.util=ALL-UNNAMED"
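Since the only parameter varied across the three experiments is spark.memory.offHeap.size, the sweep can be scripted. The helper below is a hypothetical sketch, not part of the original setup: it rebuilds the invocation above for each off-heap size (master URL, script name, and confs are copied from the command; SPARK_HOME defaults to a made-up path when unset).

```python
# Hypothetical sweep helper: emit one spark-shell command per off-heap
# size so the three experiments can be run back to back.
import os

BASE_CONFS = [
    "spark.plugins=org.apache.gluten.GlutenPlugin",
    "spark.memory.offHeap.enabled=true",
    "spark.gluten.sql.columnar.forceShuffledHashJoin=true",
    "spark.executor.instances=4",
    "spark.executor.memory=7500m",
    "spark.executor.cores=4",
    "spark.sql.shuffle.partitions=32",
    "spark.executor.memoryOverhead=2g",
]

def build_cmd(offheap_size: str) -> str:
    """Assemble the spark-shell command line for one off-heap setting."""
    confs = BASE_CONFS + [f"spark.memory.offHeap.size={offheap_size}"]
    flags = " ".join(f"--conf {c}" for c in confs)
    spark_home = os.environ.get("SPARK_HOME", "/opt/spark")  # assumed default
    return (f"cat tpch_parquet.scala | {spark_home}/bin/spark-shell "
            f"--master spark://172.32.5.244:7077 --deploy-mode client {flags}")

for size in ("6g", "12g", "20g"):
    print(build_cmd(size))
```

Each printed line could then be passed to a shell (or subprocess.run with shell=True) and timed per query to compare the three settings.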
System information
Gluten Version: 1.3.0-SNAPSHOT
Commit: 4dfdfd7
CMake Version: 3.28.3
System: Linux-6.8.0-1021-aws
Arch: aarch64
CPU Name:
C++ Compiler: /usr/bin/c++
C++ Compiler Version: 11.4.0
C Compiler: /usr/bin/cc
C Compiler Version: 11.4.0
CMake Prefix Path: /usr/local;/usr;/;/usr/local/lib/python3.10/dist-packages/cmake/data;/usr/local;/usr/X11R6;/usr/pkg;/opt
Relevant logs