Installing the delta-spark Python package on the Spark image also pulls in pyspark.zip, which is more than 370 MB. Can I avoid increasing the size of the Spark image?
FROM spark:3.5.3-scala2.12-java11-ubuntu

USER root

RUN set -ex; \
    apt-get update; \
    apt-get install -y python3 python3-pip; \
    rm -rf /var/lib/apt/lists/*

RUN pip install requests aspectlib delta-spark

ADD build/docker/aspectjweaver-1.9.22.1.jar /opt/spark
ADD build/docker/jars/ \
    build/docker/datatunnel-3.5.0/ \
    spark-jobserver-driver/target/spark-jobserver-driver-3.5.0.jar \
    spark-jobserver-extensions/target/spark-jobserver-extensions-3.5.0.jar /opt/spark/jars/

USER spark
Hi,

To minimize the size of the Spark image while adding delta-spark, I would consider:

- Using a lighter base image.
- Installing only the necessary dependencies instead of pulling in the entire pyspark distribution.
- Using a multi-stage build so that only the essential files end up in the final image.
- Cleaning up temporary files and caches after installation (see the sketch after this list).
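For the second and fourth points, here is a minimal sketch (not tested against this exact image). It assumes the base image already contains a full Spark distribution, including the PySpark sources under /opt/spark/python, so the 370 MB pyspark wheel is redundant at runtime and pip's --no-deps flag can skip it. If delta-spark declares other small dependencies (importlib_metadata, for example), they would then need to be listed explicitly.

FROM spark:3.5.3-scala2.12-java11-ubuntu

USER root

RUN set -ex; \
    apt-get update; \
    apt-get install -y --no-install-recommends python3 python3-pip; \
    rm -rf /var/lib/apt/lists/*

# --no-deps skips the pyspark dependency entirely; --no-cache-dir keeps
# pip's download cache out of the image layer.
RUN pip3 install --no-cache-dir requests aspectlib && \
    pip3 install --no-cache-dir --no-deps delta-spark

# Assumption: jobs are launched through spark-submit, which puts the
# Spark-bundled PySpark (and py4j) on the Python path, so the pyspark
# wheel is never needed inside the image.

USER spark

A multi-stage build would give roughly the same result by copying only the installed delta-spark package directory into the final image, but --no-deps is usually the simpler route when the base image already ships Spark.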