Add workbench-session std/min variants, expand std optional packages #113

Open
ianpittwood wants to merge 9 commits into main from deps/session-image-packages

ianpittwood commented May 22, 2026

Closes #89

Summary

  • Adds Standard (std, primary) and Minimal (min) variants to workbench-session, matching the workbench image pattern. Existing R{R}-python{P}-{os} tags continue to resolve to the std variant — variant suffix is opt-in.
  • Expands the std-variant optional package list for both workbench and workbench-session to cover the system dependencies of the most-downloaded CRAN packages and the common Python source-build paths.
  • Templates carry per-group Jinja {# … -#} annotations explaining what each cluster of packages supports. Rendered .txt files stay annotation-free until posit-dev/images-shared#550 lands the macro change that lets apt-get install skip # lines.
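
The annotation behaviour in the last bullet can be pictured with a short Python sketch. This is not the macro from posit-dev/images-shared#550; `strip_annotations` and the sample list are hypothetical and only illustrate dropping `#` lines and blank lines before package names reach `apt-get install`.

```python
def strip_annotations(rendered: str) -> list[str]:
    """Return package names from a rendered list, skipping blank and # lines."""
    packages = []
    for line in rendered.splitlines():
        name = line.strip()
        if not name or name.startswith("#"):
            continue  # group annotation or spacer, not a package
        packages.append(name)
    return packages


# Hypothetical rendered list that keeps its group annotations as # comments.
SAMPLE = """\
# Geospatial
libgdal-dev

# Database client headers
libmariadb-dev
libmariadb-dev-compat
"""

if __name__ == "__main__":
    # Prints the names on one line, as they might be handed to apt-get install.
    print(" ".join(strip_annotations(SAMPLE)))
```

Until a change of this kind is available, keeping the rendered .txt files annotation-free avoids passing comment lines to the installer as package names.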

Variant change for workbench-session

bakery.yaml gains:

variants:
  - name: Standard
    extension: std
    tagDisplayName: std
    primary: true
  - name: Minimal
    extension: min
    tagDisplayName: min

Both variants ship the same R, Python, Jupyter, Quarto, TinyTeX, and Posit Pro Drivers — the only difference is the optional dev headers / tooling installed on std. The min variant is a starting point for customers who want strict control over which system libraries are present.

Tag examples after this change:

  • R4.5.3-python3.14.4-ubuntu-24.04 — std variant (default OS, default variant) — unchanged
  • R4.5.3-python3.14.4-ubuntu-24.04-std — explicit std
  • R4.5.3-python3.14.4-ubuntu-24.04-min — minimal variant
  • R4.5.3-python3.14.4-min — minimal variant on the default OS

The Helm chart's existing session.image.tag derivation (R{rVersion}-python{pythonVersion}-{os}) is unchanged — std is primary, so no-suffix tags still resolve to it.
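
As a rough illustration of the tag rules above, the following Python sketch maps a tag to the variant it resolves to. It is an assumption-laden model, not bakery's or the Helm chart's implementation: `session_tag` and `resolve_variant` are hypothetical helpers, and the only inputs taken from this PR are the `std`/`min` extensions, the primary flag, and the tag format.

```python
# Extensions from the bakery.yaml variants block; std is marked primary.
VARIANT_EXTENSIONS = {"std", "min"}
PRIMARY_EXTENSION = "std"


def session_tag(r_version: str, python_version: str, os_name: str) -> str:
    """Build a tag in the R{rVersion}-python{pythonVersion}-{os} format."""
    return f"R{r_version}-python{python_version}-{os_name}"


def resolve_variant(tag: str) -> str:
    """Return the variant extension a tag resolves to.

    A trailing -std or -min selects that variant explicitly; any other
    ending (for example an OS version) falls back to the primary variant.
    """
    suffix = tag.rsplit("-", 1)[-1]
    return suffix if suffix in VARIANT_EXTENSIONS else PRIMARY_EXTENSION


if __name__ == "__main__":
    for tag in (
        session_tag("4.5.3", "3.14.4", "ubuntu-24.04"),
        "R4.5.3-python3.14.4-ubuntu-24.04-std",
        "R4.5.3-python3.14.4-ubuntu-24.04-min",
        "R4.5.3-python3.14.4-min",
    ):
        print(tag, "->", resolve_variant(tag))
```

Because the unsuffixed form falls back to the primary variant, a chart that keeps emitting `R{rVersion}-python{pythonVersion}-{os}` needs no change to keep pulling std.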

Package selection — sources

The expanded optional lists were curated against three authoritative sources, in priority order:

  1. Posit Connect 2026.03.1 "System Dependencies of R Packages" Ubuntu 24.04 list (the Tier 1 baseline of 49 apt packages).
  2. The r-hub/r-system-requirements rules catalog — the same catalog that pak::pkg_sysreqs() and Posit Package Manager consult.
  3. The legacy rstudio/rstudio-docker-products/r-session-complete/Dockerfile.ubuntu2204 Workbench-specific apt block.

What's in the lists

The workbench std optional list grew from 45 → 87 packages on both Ubuntu 24.04 and 22.04. The workbench-session std optional list is 78 packages — the same Tier 1 set, minus:

  • workbench-server-only packages (sssd, libapparmor1, libedit2, oddjob-mkhomedir) — the session image is the runtime, not the server.
  • python3 / python3-dev / python3-venv: the matrix image already ships a uv-managed Python at /opt/python/$PYTHON_VERSION/ that includes the standard library.
  • Entries already covered by the session base list (git, libpq-dev, krb5-user, libuser, libuser1-dev, rrdtool, subversion, xz-utils, ca-certificates, curl).
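
The three exclusions above amount to a set difference. The sketch below shows that filter in Python under stated assumptions: the real lists live in Jinja templates, `session_optional` is a hypothetical helper, and the candidate list is a made-up sample, so the 87 and 78 package counts are not reproduced here.

```python
# Exclusion groups copied from the bullets above.
SERVER_ONLY = {"sssd", "libapparmor1", "libedit2", "oddjob-mkhomedir"}
SYSTEM_PYTHON = {"python3", "python3-dev", "python3-venv"}
SESSION_BASE = {
    "git", "libpq-dev", "krb5-user", "libuser", "libuser1-dev",
    "rrdtool", "subversion", "xz-utils", "ca-certificates", "curl",
}


def session_optional(candidates: list[str]) -> list[str]:
    """Drop packages the session image does not need in its optional list."""
    excluded = SERVER_ONLY | SYSTEM_PYTHON | SESSION_BASE
    return sorted(set(candidates) - excluded)


if __name__ == "__main__":
    # Hypothetical candidate list mixing kept and excluded packages.
    sample = ["libgdal-dev", "sssd", "python3-dev", "git", "libuv1-dev", "curl"]
    print(session_optional(sample))
```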

Packages are grouped in the templates by sysreq category. Representative R / Python consumers per group:

| Group | Representative consumers |
| --- | --- |
| Build toolchain | Matrix, RcppArmadillo, nloptr, scipy/numpy from source |
| Compression CLI | Quarto bundles, source archive extraction, conda envs |
| Compression headers | data.table, httpuv, Pillow, scipy, lxml from source |
| TLS / HTTP / crypto | curl, httr, httr2, openssl, urllib3, cryptography, sodium, plumber |
| FFI / readline / regex | Python cffi/cryptography, source rebuilds of R + Python |
| XML / Unicode / text | xml2, stringi, lxml, arrow |
| Database client headers | RPostgres, RMariaDB, odbc, DBI, psycopg2, mysqlclient, pyodbc |
| Version control / network | gert, git2r, renv from GitHub, pip git+ installs, SSH transports |
| Geospatial | sf, terra, stars, geopandas, rasterio, pyproj |
| Graphics / fonts / imaging | ragg, systemfonts, textshaping, magick, Cairo, Pillow, matplotlib |
| Math libraries | igraph (GLPK), gmp, Rmpfr, gsl, RcppGSL |
| V8 / protobuf / arrow | V8, RProtoBuf, arrow, httpuv source-build |
| PDF processing | pdftools, pdfminer.six, pdf2image, magick PDF rasterization |
| X11 / OpenGL | rgl, plotly 3D, configure-script probes |
| Tcl/Tk | tcltk, Python tkinter |
| Java | rJava, RJDBC, xlsx, JPype1 (requires R CMD javareconf after install) |
| Authentication / directory | mongolite, pymongo GSSAPI, python-ldap, requests-kerberos |

Notable renames

  • libmysqlclient-dev → default-libmysqlclient-dev + libmariadb-dev + libmariadb-dev-compat, matching Posit's documented Connect 24.04 list. The -compat package provides the mysql.h symlink so code expecting MySQL Connector headers keeps building.
  • libfreetype-dev → libfreetype6-dev for parity with Posit's docs across both OSes.

Intentional exclusions

Following the analysis's "Tier 2 / borderline" guidance, the following are intentionally left out — they're available for customers to add via extension Dockerfiles, but bundling them by default would inflate image size and CVE surface for a relatively narrow audience:

  • texlive — Quarto's TinyTeX is already installed and covers the same paths.
  • tesseract-ocr / libtesseract-dev — OCR is niche; ~22 MB and pulls 5 dependencies.
  • ffmpeg — ~80 MB; only needed for imageio-ffmpeg, pyav, manim.
  • graphviz / graphviz-dev: DiagrammeR / pygraphviz users can layer on; pulls a chain of X libs.
  • libhdf5-dev / libnetcdf-dev — Bioconductor / h5py / xarray heavy users only.
  • jags: rjags / R2jags is statistical-modeling-niche; known CVE source.
  • libreoffice-core / unoconv — huge for an officer corner-case.
  • libopencv-dev, libavfilter-dev: opencv-python ships wheels.
  • nodejs / npm — Quarto bundles Node; libnode-dev already covers V8.
  • libhiredis-dev, librdkafka-dev — Redis / Kafka clients are ecosystem-specific.
  • freetds-dev — MS SQL via FreeTDS is niche; the Posit Pro Drivers stack already covers SQL Server over ODBC.
  • libtool, pkgconf — pulled transitively when needed by pkg-config / autoconf.
  • texinfo — modern Rd2pdf flow goes through Quarto / TinyTeX.

Files touched

  • bakery.yaml — new workbench-session variants block.
  • workbench-session/template/Containerfile.ubuntu{22,24}04.jinja2 — conditionally COPY+install optional packages on std only.
  • workbench-session/template/test/goss.yaml.jinja2: asserts optional packages only when IMAGE_VARIANT == "Standard".
  • workbench-session/template/deps/ubuntu-{22,24}.04_optional_packages.txt.jinja2: new files (std-only).
  • workbench/template/deps/ubuntu-{22,24}.04_optional_packages.txt.jinja2 — expanded with the same grouped / annotated structure.
  • workbench-session/README.md — documents the new variants and tag format.
  • All workbench/{2025.09,2026.01,2026.04}/deps/ubuntu-*_optional_packages.txt and workbench-session/matrix/... — re-rendered from templates by bakery update files.

Bakery file-update note

bakery update files also re-renders other Containerfiles in the repo that have drifted from their templates (likely from older bakery versions or template tweaks that landed without a re-render). Those unrelated drifts were reverted by hand to keep this PR focused on the intended scope. A follow-up cleanup PR can take care of those.

The compass_artifact_…_text_markdown.md research note in the repo root is intentionally not committed — it can be removed locally.

Test plan

  • CI green for session.yml (matrix builds for both std and min variants on both OSes).
  • CI green for production.yml (workbench std installs the expanded optional list cleanly).
  • bakery run dgoss --image-name workbench-session passes on both std (asserts new optional packages installed) and min (does not).
  • bakery run dgoss --image-name workbench passes on std with the expanded list.
  • Spot-check image size delta vs. prior std builds — expect a noticeable but bounded growth from the additional dev headers; track over time as a CVE/size budget conversation.
  • Pull a workbench-session std image and verify a sampling of new packages: apt-cache policy libgdal-dev libabsl-dev default-libmysqlclient-dev libmariadb-dev-compat libuv1-dev libpoppler-cpp-dev.
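
For the spot-check in the last bullet, a small Python helper could flag packages that `apt-cache policy` reports as not installed. This is only a sketch: `missing_packages` is a hypothetical helper, and the sample output uses placeholder versions and a placeholder mirror URL rather than real values.

```python
def missing_packages(policy_output: str) -> list[str]:
    """List packages whose Installed field is (none) in apt-cache policy output."""
    missing = []
    current = None
    for line in policy_output.splitlines():
        # Package headers are unindented and end with a colon, e.g. "libgdal-dev:".
        if line and not line[0].isspace() and line.endswith(":"):
            current = line[:-1]
        elif current and line.strip() == "Installed: (none)":
            missing.append(current)
    return missing


# Illustrative output only; versions and mirror are placeholders.
SAMPLE = """\
libgdal-dev:
  Installed: 1.0-illustrative
  Candidate: 1.0-illustrative
  Version table:
 *** 1.0-illustrative 500
        500 http://archive.example/ubuntu noble/universe amd64 Packages
        100 /var/lib/dpkg/status
libuv1-dev:
  Installed: (none)
  Candidate: 1.0-illustrative
  Version table:
     1.0-illustrative 500
        500 http://archive.example/ubuntu noble/universe amd64 Packages
"""

if __name__ == "__main__":
    print(missing_packages(SAMPLE))
```

Inside a running container, the text could come from `subprocess.run(["apt-cache", "policy", *names], capture_output=True, text=True).stdout`; an empty result from `missing_packages` would mean every sampled package is installed.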

🤖 Generated with Claude Code

ianpittwood force-pushed the deps/session-image-packages branch from bc2c3fc to 45a6ddb on June 2, 2026 19:30
ianpittwood and others added 4 commits June 3, 2026 08:47
- bakery.yaml: add Standard (std, primary) and Minimal (min) variants
  to workbench-session, mirroring workbench. Existing tags without a
  variant suffix continue to resolve to std.
- workbench-session/template: conditionally COPY and install optional
  packages on the std variant only, assert in goss on Standard only,
  document the new tag format and variants in the README.
- workbench std + workbench-session std optional package lists:
  expand to a curated dev-header + tooling set covering the most-
  downloaded CRAN packages and common Python source-build paths.
  Templates are grouped with Jinja {# ... -#} annotations explaining
  each group; rendered .txt files stay annotation-free until
  posit-dev/images-shared#550 lands.

Rationale, exclusion list, and tag examples are documented in the
PR description.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
ianpittwood force-pushed the deps/session-image-packages branch from 45a6ddb to 1b7e155 on June 3, 2026 14:51
ianpittwood and others added 5 commits June 3, 2026 09:56
default-libmysqlclient-dev is a metapackage that on jammy resolves to the
real libmysqlclient-dev, which Conflicts: with libmariadb-dev-compat (both
provide mysql.h / libmysqlclient). On noble it merely re-pulls the two
MariaDB packages we already list explicitly.

Keep libmariadb-dev + libmariadb-dev-compat: this matches Posit Connect's
documented Ubuntu 24.04 list and the -compat package still provides the
mysql.h / libmysqlclient symlinks that mysqlclient (PyPI), RMySQL, and
RMariaDB build against.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
ianpittwood marked this pull request as ready for review June 4, 2026 19:54

Development

Successfully merging this pull request may close these issues.

Evaluate system package lists for Workbench images
