build: Clean up and reorganize backend container build #11160

Open
hangy opened this issue Dec 21, 2024 · 3 comments
Labels
👩‍💻 DevOps 🐋 Docker https://docker-curriculum.com/ docker Pull requests that update Docker code

Comments

@hangy
Member

hangy commented Dec 21, 2024

We currently install around 1.6 GB worth of Debian packages (measured by the uncompressed container layer size) when building the backend image.

# Install cpm to install cpanfile dependencies
RUN --mount=type=cache,id=apt-cache,target=/var/cache/apt set -x && \
apt update && \
apt install -y \
apache2 \
apt-utils \
cpanminus \
# being able to build things
g++ \
gcc \
less \
libapache2-mod-perl2 \
make \
gettext \
wget \
# images processing
imagemagick \
graphviz \
tesseract-ocr \
# ftp client
lftp \
# some compression utils
gzip \
tar \
unzip \
zip \
pigz \
# useful to send mail
mailutils \
# perlmagick \
#
# Packages from ./cpanfile:
# If cpanfile specifies a newer version than apt has, cpanm will install the newer version.
#
libtie-ixhash-perl \
libwww-perl \
libimage-magick-perl \
libxml-encoding-perl \
libtext-unaccent-perl \
libmime-lite-perl \
libcache-memcached-fast-perl \
libjson-pp-perl \
libclone-perl \
libcrypt-passwdmd5-perl \
libencode-detect-perl \
libgraphics-color-perl \
libbarcode-zbar-perl \
libxml-feedpp-perl \
liburi-find-perl \
libxml-simple-perl \
libexperimental-perl \
libapache2-request-perl \
libdigest-md5-perl \
libtime-local-perl \
libdbd-pg-perl \
libtemplate-perl \
liburi-escape-xs-perl \
# NB: not available in ubuntu 1804 LTS:
libmath-random-secure-perl \
libfile-copy-recursive-perl \
libemail-stuffer-perl \
liblist-moreutils-perl \
libexcel-writer-xlsx-perl \
libpod-simple-perl \
liblog-any-perl \
liblog-log4perl-perl \
liblog-any-adapter-log4perl-perl \
# NB: not available in ubuntu 1804 LTS:
libgeoip2-perl \
libemail-valid-perl
RUN --mount=type=cache,id=apt-cache,target=/var/cache/apt set -x && \
apt install -y \
#
# cpan dependencies that can be satisfied by apt even if the package itself can't:
#
# Action::Retry
libmath-fibonacci-perl \
# EV - event loop
libev-perl \
# Algorithm::CheckDigits
libprobe-perl-perl \
# CLDR::Number
libmath-round-perl \
libsoftware-license-perl \
libtest-differences-perl \
libtest-exception-perl \
# Data::Dumper::AutoEncode
# NB: not available in ubuntu 1804 LTS:
libmodule-build-pluggable-perl \
libclass-accessor-lite-perl \
# DateTime
libclass-singleton-perl \
# DateTime::Locale
libfile-sharedir-install-perl \
# File::chmod::Recursive
libfile-chmod-perl \
# GeoIP2
libdata-dumper-concise-perl \
libdata-printer-perl \
libdata-validate-ip-perl \
libio-compress-perl \
libjson-maybexs-perl \
libcpanel-json-xs-perl \
liblist-allutils-perl \
liblist-someutils-perl \
# GraphViz2
libdata-section-simple-perl \
libfile-which-perl \
libipc-run3-perl \
liblog-handler-perl \
libtest-deep-perl \
libwant-perl \
# Image::OCR::Tesseract
libfile-find-rule-perl \
liblinux-usermod-perl \
# Locale::Maketext::Lexicon::Getcontext
liblocale-maketext-lexicon-perl \
# Log::Any::Adapter::TAP
liblog-any-adapter-tap-perl \
# Math::Random::Secure
libcrypt-random-source-perl \
libmath-random-isaac-perl \
libtest-sharedfork-perl \
libtest-warn-perl \
# Mojo::Pg
libsql-abstract-perl \
# MongoDB
libauthen-sasl-saslprep-perl \
libauthen-scram-perl \
libbson-perl \
libclass-xsaccessor-perl \
libconfig-autoconf-perl \
libdigest-hmac-perl \
libpath-tiny-perl \
libsafe-isa-perl \
# Spreadsheet::CSV
libspreadsheet-parseexcel-perl \
# Test::Number::Delta
libtest-number-delta-perl \
libdevel-size-perl \
gnumeric \
# for dev
# gnu readline
libreadline-dev \
# IO::AIO needed by Perl::LanguageServer
libperl-dev \
# needed to build Apache2::Connection::XForwardedFor
libapache2-mod-perl2-dev \
# Imager::zxing - build deps
cmake \
pkg-config \
# Imager::zxing - decoders
libavif-dev \
libde265-dev \
libheif-dev \
libjpeg-dev \
libpng-dev \
libwebp-dev \
libx265-dev

These two RUN statements produce the 722 MB and 922 MB layers in the final image:

hangy@xxx:~/off/openfoodfacts-server-main$ docker history ghcr.io/openfoodfacts/openfoodfacts-server/backend:latest
IMAGE          CREATED       CREATED BY                                      SIZE      COMMENT
596b1762876e   2 days ago    CMD ["apache2ctl" "-D" "FOREGROUND"]            0B        buildkit.dockerfile.v0
<missing>      2 days ago    ENTRYPOINT ["/docker-entrypoint.sh"]            0B        buildkit.dockerfile.v0
<missing>      2 days ago    USER www-data                                   0B        buildkit.dockerfile.v0
<missing>      2 days ago    WORKDIR /opt/product-opener/                    0B        buildkit.dockerfile.v0
<missing>      2 days ago    COPY ./docker/docker-entrypoint.sh / # build…   2.39kB    buildkit.dockerfile.v0
<missing>      2 days ago    EXPOSE map[80/tcp:{}]                           0B        buildkit.dockerfile.v0
<missing>      2 days ago    COPY . /opt/product-opener/ # buildkit          423MB     buildkit.dockerfile.v0
<missing>      2 weeks ago   RUN /bin/sh -c mkdir -p var/run/apache2/ && …   1.11MB    buildkit.dockerfile.v0
<missing>      2 weeks ago   RUN /bin/sh -c a2dismod mpm_event &&     a2e…   68B       buildkit.dockerfile.v0
<missing>      2 weeks ago   ENV PATH=/opt/perl/local/bin:/usr/local/sbin…   0B        buildkit.dockerfile.v0
<missing>      2 weeks ago   ENV PERL5LIB=/opt/product-opener/lib/:/opt/p…   0B        buildkit.dockerfile.v0
<missing>      2 weeks ago   COPY /tmp/local/ /opt/perl/local/ # buildkit    83.8MB    buildkit.dockerfile.v0
<missing>      2 weeks ago   RUN /bin/sh -c rm /etc/apache2/sites-enabled…   0B        buildkit.dockerfile.v0
<missing>      2 weeks ago   RUN |2 USER_UID=1000 USER_GID=1000 /bin/sh -…   328kB     buildkit.dockerfile.v0
<missing>      2 weeks ago   ARG USER_GID                                    0B        buildkit.dockerfile.v0
<missing>      2 weeks ago   ARG USER_UID                                    0B        buildkit.dockerfile.v0
<missing>      2 weeks ago   RUN /bin/sh -c set -x &&     cd /tmp &&     …   1.27MB    buildkit.dockerfile.v0
<missing>      2 weeks ago   RUN /bin/sh -c set -x &&     apt install -y …   922MB     buildkit.dockerfile.v0
<missing>      2 weeks ago   RUN /bin/sh -c set -x &&     apt update &&  …   722MB     buildkit.dockerfile.v0
<missing>      2 weeks ago   # debian.sh --arch 'amd64' out/ 'bullseye' '…   124MB     debuerreotype 0.15

  1. Review which packages are actually necessary to run ProductOpener.
  2. Review which packages might be necessary for local development but not for production deployments.
  3. Review which packages are necessary only during the build (i.e. make, g++, *-dev variants of packages).
  4. Remove packages that are no longer needed at all (i.e. Perl packages that are now unused).
  5. For Perl modules in the "If cpanfile specifies a newer version than apt has, cpanm will install the newer version" section: figure out which modules will definitely be replaced by cpanm, and remove the redundant apt versions from the Dockerfile.

Based on this information, it might be good to use more build stages to split the "backend" container image into a "backend-dev" and a "backend-run" image, where a potential "backend-run" image should not contain any binaries that are not strictly necessary for deploying/running ProductOpener. This would not only reduce the image size, but also reduce the number of attack vectors.
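
To make the idea more concrete, here is a rough sketch of how such a split could look with multi-stage builds. It is not the current Dockerfile: the stage names (`backend-base`, `builder`, `backend-run`, `backend-dev`) are made up for illustration, the package lists are small placeholders until the review above is done, and using plain `cpanm` for the cpanfile dependencies is an assumption.

```dockerfile
# syntax=docker/dockerfile:1

# Hypothetical layout; stage names and package selections are assumptions,
# not the result of the package review proposed in this issue.

# Runtime base: only packages needed to actually serve requests.
FROM debian:bullseye AS backend-base
RUN --mount=type=cache,id=apt-cache,target=/var/cache/apt set -x && \
    apt update && \
    apt install -y \
        apache2 \
        libapache2-mod-perl2 \
        imagemagick \
        tesseract-ocr

# Build stage: compilers and *-dev packages exist only here.
FROM backend-base AS builder
RUN --mount=type=cache,id=apt-cache,target=/var/cache/apt set -x && \
    apt install -y \
        cpanminus \
        g++ \
        gcc \
        make \
        cmake \
        pkg-config \
        libperl-dev \
        libapache2-mod-perl2-dev
WORKDIR /tmp
COPY cpanfile /tmp/cpanfile
# Install the cpanfile dependencies into /tmp/local so that they can be
# copied into the later stages without the toolchain.
RUN cpanm --notest --local-lib /tmp/local/ --installdeps .

# Production image: runtime packages, built Perl modules, application code.
FROM backend-base AS backend-run
COPY --from=builder /tmp/local/ /opt/perl/local/
# ENV PERL5LIB / PATH: keep as in the existing Dockerfile (the values are
# truncated in the docker history output above, so they are not repeated here).
COPY . /opt/product-opener/
COPY ./docker/docker-entrypoint.sh /
WORKDIR /opt/product-opener/
USER www-data
EXPOSE 80
ENTRYPOINT ["/docker-entrypoint.sh"]
CMD ["apache2ctl", "-D", "FOREGROUND"]

# Development image: production image plus tools only needed locally.
FROM backend-run AS backend-dev
USER root
RUN --mount=type=cache,id=apt-cache,target=/var/cache/apt set -x && \
    apt update && \
    apt install -y \
        less \
        libreadline-dev
USER www-data
```

Each image could then be built with `docker build --target backend-run .` or `docker build --target backend-dev .`. With this layout, anything installed in `builder` never ends up in a layer of `backend-run`; only the contents of `/tmp/local/` are carried over.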

@hangy hangy added docker Pull requests that update Docker code 🐋 Docker https://docker-curriculum.com/ labels Dec 21, 2024
@github-project-automation github-project-automation bot moved this to To discuss and validate in 🍊 Open Food Facts Server issues Dec 21, 2024
@benbenben2
Collaborator

The image used by the GitHub Actions runner comes with many preinstalled packages. One option could be to remove the unnecessary ones:
https://github.com/actions/runner-images/blob/main/images/ubuntu/Ubuntu2204-Readme.md

@hangy
Member Author

hangy commented Dec 21, 2024

tbh, my main concern with this task was less about the actions runner running out of space than about the image in general. One idea is to use the container image for production deployments (IIRC it's already being used for some domain?), and that really should come with a cleaner container image, so that deployments are faster, smaller, and have a lower chance of coming with CVEs 😄
