Skip to content

Conversation

@kalavt
Copy link

@kalavt kalavt commented Dec 25, 2025

Description

Add optional Apache Arrow/Parquet encoder support to Docker images via unified EXTRA_CMAKE_FLAGS build argument.

Changes

  • Add conditional Parquet support in Dockerfile (Debian)
  • Add conditional Parquet support in Dockerfile.windows (vcpkg static linking)
  • Update dockerfiles/README.md with build examples
  • Add maintainer notes for Arrow SONAME version management

Usage

# Linux
docker build --build-arg EXTRA_CMAKE_FLAGS="-DFLB_PARQUET_ENCODER=On" -f dockerfiles/Dockerfile .

# Windows
docker build --build-arg EXTRA_CMAKE_FLAGS="-DFLB_PARQUET_ENCODER=On" -f dockerfiles/Dockerfile.windows .

Testing

Example configuration: See above usage section

Build test output:

  • Successfully built with Parquet support
  • Image size impact: +49MB (172MB vs 123MB)
  • Tested on Linux amd64 and Windows configurations

Valgrind: N/A (Docker packaging change only)

Packaging test:

  • Linux build validated
  • Windows build configuration validated
  • Documentation updated

Documentation

  • dockerfiles/README.md updated
  • Maintenance notes included

Backporting

N/A - New optional feature

Summary by CodeRabbit

  • New Features

    • Optional Parquet encoder support can be enabled at build time for Linux and Windows via a build-time flag.
  • Bug Fixes / Improvements

    • Ensures required compression/runtime libraries are included when Parquet is enabled.
    • Conditional installation and synchronized packaging across build, extractor, and debug images so optional components are only added when requested.
  • Documentation

    • Updated build docs with usage examples and notes about increased image size when Parquet support is enabled.

✏️ Tip: You can customize this high-level summary in your review settings.

@coderabbitai
Copy link

coderabbitai bot commented Dec 25, 2025

📝 Walkthrough

Walkthrough

Adds an EXTRA_CMAKE_FLAGS build argument/ENV across Linux and Windows Docker build stages and conditionally provisions Apache Arrow/Parquet (and snappy) packages and CMake args when -DFLB_PARQUET_ENCODER=On; updates Docker README with example build commands to enable Parquet support.

Changes

Cohort / File(s) Summary
Linux Dockerfile
dockerfiles/Dockerfile
Adds ARG/ENV EXTRA_CMAKE_FLAGS in builder-base, deb-extractor, and debug stages; extends hadolint ignore list (adds SC2015); conditionally adds Arrow apt repo and installs libarrow-dev, libparquet-dev, libsnappy1v5 and related packages when EXTRA_CMAKE_FLAGS contains -DFLB_PARQUET_ENCODER=On.
Windows Dockerfile
dockerfiles/Dockerfile.windows
Adds ARG EXTRA_CMAKE_FLAGS (builder-base and builder stages); derives OPTIONAL_CMAKE_ARGS from EXTRA_CMAKE_FLAGS; conditionally sets Arrow_DIR/Parquet_DIR and injects them into CMake invocations; includes conditional Arrow/Parquet install logic and messaging.
Documentation
dockerfiles/README.md
Adds "Optional: Build with Parquet encoder support" sections and example build commands for Linux and Windows using EXTRA_CMAKE_FLAGS="-DFLB_PARQUET_ENCODER=On"; notes image-size impact for Arrow dependencies and minor formatting tweaks.

Sequence Diagram(s)

(omitted)

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

Suggested reviewers

  • patrick-stephens
  • niedbalski
  • celalettin1286

Poem

🐰 I nudged a flag into the build day's light,
Arrow crumbs and snappy bits tuck tight,
Parquet joins the image parade,
CMake hums while artifacts are made,
Tiny hops, big data delight.

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The PR title 'docker: add arrow/parquet support' accurately reflects the main change: adding optional Apache Arrow/Parquet encoder support to Docker images.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
✨ Finishing touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (1)
dockerfiles/Dockerfile (1)

137-139: DRY concern: Arrow repository setup is duplicated across stages.

The Arrow apt repository setup logic (curl the deb, install it, update) is repeated in builder-base, deb-extractor, and debug stages (lines 61-67, 146-152, 303-309). Consider extracting this into a shared script or documenting why duplication is necessary (e.g., multi-stage isolation requirements).

Also applies to: 144-152

📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 7b0c1ae and 78c38bc.

📒 Files selected for processing (3)
  • dockerfiles/Dockerfile
  • dockerfiles/Dockerfile.windows
  • dockerfiles/README.md
🧰 Additional context used
🧠 Learnings (10)
📚 Learning: 2025-08-31T12:46:11.940Z
Learnt from: ThomasDevoogdt
Repo: fluent/fluent-bit PR: 9277
File: .github/workflows/pr-compile-check.yaml:147-151
Timestamp: 2025-08-31T12:46:11.940Z
Learning: In fluent-bit CMakeLists.txt, the system library preference flags are defined as FLB_PREFER_SYSTEM_LIB_ZSTD and FLB_PREFER_SYSTEM_LIB_KAFKA with the FLB_ prefix.

Applied to files:

  • dockerfiles/Dockerfile
  • dockerfiles/README.md
📚 Learning: 2025-08-31T12:46:11.940Z
Learnt from: ThomasDevoogdt
Repo: fluent/fluent-bit PR: 9277
File: .github/workflows/pr-compile-check.yaml:147-151
Timestamp: 2025-08-31T12:46:11.940Z
Learning: In fluent-bit, the correct CMake flag for using system librdkafka is `FLB_PREFER_SYSTEM_LIB_KAFKA=ON`.

Applied to files:

  • dockerfiles/Dockerfile
  • dockerfiles/README.md
📚 Learning: 2025-08-07T10:15:46.187Z
Learnt from: cosmo0920
Repo: fluent/fluent-bit PR: 10691
File: .github/workflows/unit-tests.yaml:94-101
Timestamp: 2025-08-07T10:15:46.187Z
Learning: Apache Arrow latest packages have changing checksums, making hardcoded checksum verification impractical for CI workflows that download the latest version.

Applied to files:

  • dockerfiles/Dockerfile.windows
📚 Learning: 2025-09-14T09:46:09.531Z
Learnt from: aminvakil
Repo: fluent/fluent-bit PR: 10844
File: conf/fluent-bit:13-15
Timestamp: 2025-09-14T09:46:09.531Z
Learning: For fluent-bit Debian packaging, /opt/fluent-bit/bin/ is the appropriate installation path since the package may be installed from non-official Debian sources, making /opt compliant with FHS for optional software packages.

Applied to files:

  • dockerfiles/README.md
📚 Learning: 2025-09-04T07:28:37.083Z
Learnt from: cosmo0920
Repo: fluent/fluent-bit PR: 10832
File: src/wasm/CMakeLists.txt:112-131
Timestamp: 2025-09-04T07:28:37.083Z
Learning: In fluent-bit CMake files, the user cosmo0920 prefers treating Git as a command rather than a package, emphasizing that Git is not a pkg-config retrievable package but just a command.

Applied to files:

  • dockerfiles/README.md
📚 Learning: 2025-08-29T06:24:44.797Z
Learnt from: shadowshot-x
Repo: fluent/fluent-bit PR: 10794
File: src/aws/flb_aws_compress.c:26-26
Timestamp: 2025-08-29T06:24:44.797Z
Learning: In Fluent Bit, ZSTD support is always available and enabled by default. The build system automatically detects and uses either the system libzstd library or builds the bundled ZSTD version. Unlike other optional dependencies like Arrow which use conditional compilation guards (e.g., FLB_HAVE_ARROW), ZSTD does not require conditional includes or build flags.

Applied to files:

  • dockerfiles/README.md
📚 Learning: 2025-08-29T06:25:27.250Z
Learnt from: shadowshot-x
Repo: fluent/fluent-bit PR: 10794
File: tests/internal/aws_compress.c:93-107
Timestamp: 2025-08-29T06:25:27.250Z
Learning: In Fluent Bit, ZSTD compression is enabled by default and is treated as a core dependency, not requiring conditional compilation guards like `#ifdef FLB_HAVE_ZSTD`. Unlike some other optional components such as ARROW/PARQUET (which use `#ifdef FLB_HAVE_ARROW` guards), ZSTD support is always available and doesn't need build-time conditionals. ZSTD headers are included directly without guards across multiple plugins and core components.

Applied to files:

  • dockerfiles/README.md
📚 Learning: 2025-08-29T06:24:55.855Z
Learnt from: shadowshot-x
Repo: fluent/fluent-bit PR: 10794
File: src/aws/flb_aws_compress.c:52-56
Timestamp: 2025-08-29T06:24:55.855Z
Learning: ZSTD compression is always available in Fluent Bit and does not require conditional compilation guards. Unlike Arrow/Parquet which use #ifdef FLB_HAVE_ARROW guards, ZSTD is built unconditionally with flb_zstd.c included directly in src/CMakeLists.txt and a bundled ZSTD library at lib/zstd-1.5.7/.

Applied to files:

  • dockerfiles/README.md
📚 Learning: 2025-09-08T11:21:33.975Z
Learnt from: cosmo0920
Repo: fluent/fluent-bit PR: 10851
File: include/fluent-bit/flb_simd.h:60-66
Timestamp: 2025-09-08T11:21:33.975Z
Learning: Fluent Bit currently only supports MSVC compiler on Windows, so additional compiler compatibility guards may be unnecessary for Windows-specific code paths.

Applied to files:

  • dockerfiles/README.md
📚 Learning: 2025-08-29T06:25:27.250Z
Learnt from: shadowshot-x
Repo: fluent/fluent-bit PR: 10794
File: tests/internal/aws_compress.c:93-107
Timestamp: 2025-08-29T06:25:27.250Z
Learning: In Fluent Bit, ZSTD compression is enabled by default and is treated as a core dependency, not requiring conditional compilation guards like `#ifdef FLB_HAVE_ZSTD`. Unlike some other optional components, ZSTD support is always available and doesn't need build-time conditionals.

Applied to files:

  • dockerfiles/README.md
🪛 LanguageTool
dockerfiles/README.md

[style] ~117-~117: ‘prior to’ might be wordy. Consider a shorter alternative.
Context: ...ectures staging images used for testing prior to release - `ghcr.io/fluent/fluent-bit/ma...

(EN_WORDINESS_PREMIUM_PRIOR_TO)


[style] ~156-~156: Consider using a less common alternative to make your writing sound more unique and professional.
Context: ...Arrow with static linking. ## Contact Feel free to join us on our Mailing List or IRC: - ...

(FEEL_FREE_TO_STYLE_ME)

🪛 markdownlint-cli2 (0.18.1)
dockerfiles/README.md

25-25: Fenced code blocks should have a language specified

(MD040, fenced-code-language)


30-30: Bare URL used

(MD034, no-bare-urls)


32-32: Fenced code blocks should have a language specified

(MD040, fenced-code-language)


38-38: Bare URL used

(MD034, no-bare-urls)


40-40: Fenced code blocks should have a language specified

(MD040, fenced-code-language)


58-58: Fenced code blocks should have a language specified

(MD040, fenced-code-language)


59-59: Dollar signs used before commands without showing output

(MD014, commands-show-output)


66-66: Fenced code blocks should have a language specified

(MD040, fenced-code-language)


159-159: Bare URL used

(MD034, no-bare-urls)


160-160: Bare URL used

(MD034, no-bare-urls)


162-162: Bare URL used

(MD034, no-bare-urls)

🔇 Additional comments (8)
dockerfiles/README.md (2)

62-72: Good documentation for the optional Parquet feature.

The instructions for enabling Parquet encoder support are clear and include the useful note about image size impact. Consider adding a language specifier to the code block for syntax highlighting consistency:

-```
+```bash
 $ docker build -t fluent/fluent-bit --target=production \

[approve_code_changes, suggest_optional_refactor]


142-154: Windows Parquet documentation is consistent with Linux instructions.

The Windows-specific section appropriately mirrors the Linux documentation while noting the vcpkg/static linking approach. Consider adding the powershell language specifier to the code block:

-```powershell
+```powershell
 docker build --no-cache `

(Note: The block already has powershell - this is correct. LGTM.)

dockerfiles/Dockerfile.windows (2)

160-168: Conditional Arrow installation logic is well-structured.

The approach to conditionally install Arrow/Parquet only when explicitly requested via EXTRA_CMAKE_FLAGS keeps the default image lean. The cleanup steps after installation help reduce layer size.


197-210: CMake invocation correctly includes Arrow paths and extra flags.

The %ARROW_PATHS% and %EXTRA_CMAKE_FLAGS% variables are properly interpolated in the cmd shell context. When Parquet is not enabled, ARROW_PATHS will be empty, which is safe as CMake ignores empty arguments.

dockerfiles/Dockerfile (4)

61-68: Conditional Arrow/Parquet installation logic is sound.

The approach using grep -q to check for the Parquet flag is straightforward and effective. The installation fetches the Apache Arrow apt source dynamically based on the distribution, which is a good practice for portability.


172-172: Good addition of libsnappy1v5 for Parquet/Arrow support.

Snappy is a compression library used by Parquet. Adding it to the runtime dependencies ensures Parquet files using Snappy compression can be read.


83-84: EXTRA_CMAKE_FLAGS correctly propagated to CMake invocation.

The ARG/ENV pattern and usage in the cmake command via $EXTRA_CMAKE_FLAGS (line 108) follows the established pattern in this Dockerfile. The hadolint ignore for SC2086 at line 91 correctly documents that word splitting is intentional for multiple flags.

Also applies to: 108-108


268-271: Debug image correctly includes runtime Arrow libraries when Parquet is enabled.

The debug stage appropriately installs runtime libraries (libarrow2200, libparquet2200) rather than development libraries, which is correct for a debug/runtime image. The conditional logic mirrors the other stages.

Also applies to: 303-310

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 78c38bc and 1f8f3f6.

📒 Files selected for processing (2)
  • dockerfiles/Dockerfile
  • dockerfiles/Dockerfile.windows
🧰 Additional context used
🧠 Learnings (4)
📓 Common learnings
Learnt from: ThomasDevoogdt
Repo: fluent/fluent-bit PR: 9277
File: .github/workflows/pr-compile-check.yaml:147-151
Timestamp: 2025-08-31T12:46:11.940Z
Learning: In fluent-bit, the correct CMake flag for using system librdkafka is `FLB_PREFER_SYSTEM_LIB_KAFKA=ON`.
Learnt from: ThomasDevoogdt
Repo: fluent/fluent-bit PR: 9277
File: .github/workflows/pr-compile-check.yaml:147-151
Timestamp: 2025-08-31T12:46:11.940Z
Learning: In fluent-bit CMakeLists.txt, the system library preference flags are defined as FLB_PREFER_SYSTEM_LIB_ZSTD and FLB_PREFER_SYSTEM_LIB_KAFKA with the FLB_ prefix.
📚 Learning: 2025-08-07T10:15:46.187Z
Learnt from: cosmo0920
Repo: fluent/fluent-bit PR: 10691
File: .github/workflows/unit-tests.yaml:94-101
Timestamp: 2025-08-07T10:15:46.187Z
Learning: Apache Arrow latest packages have changing checksums, making hardcoded checksum verification impractical for CI workflows that download the latest version.

Applied to files:

  • dockerfiles/Dockerfile
  • dockerfiles/Dockerfile.windows
📚 Learning: 2025-08-31T12:46:11.940Z
Learnt from: ThomasDevoogdt
Repo: fluent/fluent-bit PR: 9277
File: .github/workflows/pr-compile-check.yaml:147-151
Timestamp: 2025-08-31T12:46:11.940Z
Learning: In fluent-bit CMakeLists.txt, the system library preference flags are defined as FLB_PREFER_SYSTEM_LIB_ZSTD and FLB_PREFER_SYSTEM_LIB_KAFKA with the FLB_ prefix.

Applied to files:

  • dockerfiles/Dockerfile
📚 Learning: 2025-08-31T12:46:11.940Z
Learnt from: ThomasDevoogdt
Repo: fluent/fluent-bit PR: 9277
File: .github/workflows/pr-compile-check.yaml:147-151
Timestamp: 2025-08-31T12:46:11.940Z
Learning: In fluent-bit, the correct CMake flag for using system librdkafka is `FLB_PREFER_SYSTEM_LIB_KAFKA=ON`.

Applied to files:

  • dockerfiles/Dockerfile
🔇 Additional comments (8)
dockerfiles/Dockerfile.windows (3)

19-20: LGTM!

The EXTRA_CMAKE_FLAGS build argument is correctly declared with an empty default value in the builder-base stage. This allows optional feature flags to be passed during build time.


160-168: LGTM!

The conditional Arrow installation logic is well-structured. The -like operator with the wildcard pattern correctly performs case-insensitive matching for the CMake flag, and the cleanup of temporary files helps reduce image size.


183-204: LGTM!

The builder stage correctly:

  1. Re-declares EXTRA_CMAKE_FLAGS (required for multi-stage builds to inherit the value)
  2. Uses findstr in cmd shell to detect the Parquet encoder flag
  3. Conditionally sets Arrow/Parquet directory paths for CMake
  4. The || cd . idiom correctly acts as a no-op when findstr fails (flag not present)
dockerfiles/Dockerfile (5)

24-26: LGTM!

The EXTRA_CMAKE_FLAGS argument and environment variable are correctly declared in the builder-base stage, enabling optional features like Parquet to be toggled at build time.


137-138: LGTM!

The deb-extractor stage correctly re-declares EXTRA_CMAKE_FLAGS and conditionally sets up the Arrow apt source before downloading packages. This ensures the Arrow/Parquet runtime packages are available when the feature is enabled.

Also applies to: 144-152


172-172: LGTM!

Adding libsnappy1v5 is correct—Snappy compression is commonly used by Parquet for data compression.


194-205: LGTM!

The maintainer comment documenting the Arrow SONAME versioning scheme (major100 + minor10) is helpful for future maintenance. The conditional download of Arrow runtime packages and their dependencies is correctly structured.


270-273: LGTM!

The EXTRA_CMAKE_FLAGS is correctly re-declared in the debug stage to enable conditional Arrow/Parquet package installation.

@kalavt kalavt force-pushed the feature/docker-arrow branch from 470ad07 to 0536f4e Compare December 25, 2025 12:31
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

♻️ Duplicate comments (3)
dockerfiles/Dockerfile (3)

61-67: The "latest" apt source creates non-deterministic builds.

Using apache-arrow-apt-source-latest-trixie.deb makes builds non-reproducible. Apache Arrow's APT repository is a rolling feed with known metadata and checksum changes. Consider pinning to a specific Arrow version (e.g., apache-arrow-apt-source-18.0.0-trixie.deb) and explicitly versioning the -dev packages (e.g., libarrow-dev=18.0.0-1).

Note: This concern was previously raised but remains unaddressed in the current implementation.


145-151: Same "latest" apt source issue in deb-extractor stage.

This stage also uses apache-arrow-apt-source-latest-trixie.deb, creating the same non-deterministic build concern. Ensure the Arrow apt source is pinned to a specific version here as well to maintain consistency.


304-318: "Latest" apt source issue persists in debug stage; package consistency is good.

The debug stage also uses apache-arrow-apt-source-latest-trixie.deb (same non-deterministic concern as other stages). However, the maintainer note correctly reminds to keep packages synchronized with deb-extractor, and the package list (lines 310-316) matches the deb-extractor list (lines 197-203), ensuring consistency.

🧹 Nitpick comments (1)
dockerfiles/Dockerfile (1)

171-171: Move libsnappy1v5 to the conditional Parquet block.

libsnappy1v5 is a Parquet-specific dependency and is downloaded unconditionally, adding unnecessary bloat when Parquet support is disabled. Move it to the conditional block at lines 195-204 alongside the other Arrow/Parquet packages.

🔎 Suggested adjustment

Remove from line 171:

     libzstd1 \
     liblz4-1 \
-    libsnappy1v5 \
     libgssapi-krb5-2 \

Add to the conditional block:

     && if echo "${EXTRA_CMAKE_FLAGS}" | grep -q "FLB_PARQUET_ENCODER=On"; then \
         apt-get download \
             libarrow2200 \
             libparquet2200 \
+            libsnappy1v5 \
             libabsl20240722 \
📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 1f8f3f6 and 470ad07.

📒 Files selected for processing (1)
  • dockerfiles/Dockerfile
🧰 Additional context used
🧠 Learnings (4)
📓 Common learnings
Learnt from: ThomasDevoogdt
Repo: fluent/fluent-bit PR: 9277
File: .github/workflows/pr-compile-check.yaml:147-151
Timestamp: 2025-08-31T12:46:11.940Z
Learning: In fluent-bit, the correct CMake flag for using system librdkafka is `FLB_PREFER_SYSTEM_LIB_KAFKA=ON`.
Learnt from: ThomasDevoogdt
Repo: fluent/fluent-bit PR: 9277
File: .github/workflows/pr-compile-check.yaml:147-151
Timestamp: 2025-08-31T12:46:11.940Z
Learning: In fluent-bit CMakeLists.txt, the system library preference flags are defined as FLB_PREFER_SYSTEM_LIB_ZSTD and FLB_PREFER_SYSTEM_LIB_KAFKA with the FLB_ prefix.
📚 Learning: 2025-08-07T10:15:46.187Z
Learnt from: cosmo0920
Repo: fluent/fluent-bit PR: 10691
File: .github/workflows/unit-tests.yaml:94-101
Timestamp: 2025-08-07T10:15:46.187Z
Learning: Apache Arrow latest packages have changing checksums, making hardcoded checksum verification impractical for CI workflows that download the latest version.

Applied to files:

  • dockerfiles/Dockerfile
📚 Learning: 2025-08-31T12:46:11.940Z
Learnt from: ThomasDevoogdt
Repo: fluent/fluent-bit PR: 9277
File: .github/workflows/pr-compile-check.yaml:147-151
Timestamp: 2025-08-31T12:46:11.940Z
Learning: In fluent-bit, the correct CMake flag for using system librdkafka is `FLB_PREFER_SYSTEM_LIB_KAFKA=ON`.

Applied to files:

  • dockerfiles/Dockerfile
📚 Learning: 2025-08-31T12:46:11.940Z
Learnt from: ThomasDevoogdt
Repo: fluent/fluent-bit PR: 9277
File: .github/workflows/pr-compile-check.yaml:147-151
Timestamp: 2025-08-31T12:46:11.940Z
Learning: In fluent-bit CMakeLists.txt, the system library preference flags are defined as FLB_PREFER_SYSTEM_LIB_ZSTD and FLB_PREFER_SYSTEM_LIB_KAFKA with the FLB_ prefix.

Applied to files:

  • dockerfiles/Dockerfile
🔇 Additional comments (4)
dockerfiles/Dockerfile (4)

24-26: LGTM! Clear documentation of the build argument.

The ARG/ENV pattern correctly propagates the build-time flag into the environment for use in conditional logic and child stages.


136-138: LGTM! Consistent propagation across stages.

The ARG/ENV pattern is correctly repeated in the deb-extractor stage to enable conditional package selection.


193-204: Well-documented maintainer guidance.

The maintainer note clearly explains the Arrow SONAME versioning scheme and when to update, addressing previous review feedback. The dependency list (abseil, protobuf, thrift, xml2) aligns with typical Arrow runtime requirements.


269-271: LGTM! Consistent across all stages.

The ARG/ENV pattern is correctly applied to the debug stage for conditional Arrow/Parquet installation.

@kalavt
Copy link
Author

kalavt commented Dec 25, 2025

@patrick-stephens for your review.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (4)
dockerfiles/Dockerfile (2)

145-151: Arrow apt source version consistency concern.

Similar to the builder-base stage (lines 61-67), using apache-arrow-apt-source-latest-trixie.deb here makes the runtime package selection non-deterministic. Consider using the same pinned Arrow version across both build and runtime stages to ensure ABI compatibility between build-time and runtime Arrow libraries.


171-171: Consider making libsnappy1v5 conditional on Parquet support.

The libsnappy1v5 package is added unconditionally but is primarily needed for Parquet/Arrow compression. To minimize dependencies in the base image, consider moving this to the conditional block at lines 195-204 where other Arrow-related packages are downloaded.

🔎 Proposed change
     liblzma5 \
     libyaml-0-2 \
     libcap2 \
     libldap2 \
-    libsnappy1v5 \
     # MAINTAINER: Arrow SONAME format: libarrow{major*100+minor*10} (e.g., 22.0.0 -> libarrow2200)
     # Only update when upgrading Arrow version or if packages become unavailable in apt source.
     && if echo "${EXTRA_CMAKE_FLAGS}" | grep -q "FLB_PARQUET_ENCODER=On"; then \
         apt-get download \
             libarrow2200 \
             libparquet2200 \
+            libsnappy1v5 \
             libabsl20240722 \
dockerfiles/README.md (2)

62-72: Documentation clearly explains Parquet support for Linux builds.

The instructions accurately document the EXTRA_CMAKE_FLAGS usage and provide helpful information about the image size impact. Consider adding a language identifier to the code block for better syntax highlighting.

🔎 Optional: Add language identifier to code block
-```
+```bash
 $ docker build -t fluent/fluent-bit --target=production \
   --build-arg EXTRA_CMAKE_FLAGS="-DFLB_PARQUET_ENCODER=On" \
   -f dockerfiles/Dockerfile .

142-153: Windows Parquet documentation is clear and accurate.

The PowerShell build command correctly demonstrates EXTRA_CMAKE_FLAGS usage for Windows, and the vcpkg note provides useful implementation context. Consider adding a language identifier to the code block.

🔎 Optional: Add language identifier to code block
-```powershell
+```powershell
 docker build --no-cache `

(Note: The code block already has powershell specified at line 146, so this is already correct. No change needed.)

📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 470ad07 and 0536f4e.

📒 Files selected for processing (3)
  • dockerfiles/Dockerfile
  • dockerfiles/Dockerfile.windows
  • dockerfiles/README.md
🚧 Files skipped from review as they are similar to previous changes (1)
  • dockerfiles/Dockerfile.windows
🧰 Additional context used
🧠 Learnings (11)
📓 Common learnings
Learnt from: ThomasDevoogdt
Repo: fluent/fluent-bit PR: 9277
File: .github/workflows/pr-compile-check.yaml:147-151
Timestamp: 2025-08-31T12:46:11.940Z
Learning: In fluent-bit, the correct CMake flag for using system librdkafka is `FLB_PREFER_SYSTEM_LIB_KAFKA=ON`.
Learnt from: ThomasDevoogdt
Repo: fluent/fluent-bit PR: 9277
File: .github/workflows/pr-compile-check.yaml:147-151
Timestamp: 2025-08-31T12:46:11.940Z
Learning: In fluent-bit CMakeLists.txt, the system library preference flags are defined as FLB_PREFER_SYSTEM_LIB_ZSTD and FLB_PREFER_SYSTEM_LIB_KAFKA with the FLB_ prefix.
📚 Learning: 2025-09-14T09:46:09.531Z
Learnt from: aminvakil
Repo: fluent/fluent-bit PR: 10844
File: conf/fluent-bit:13-15
Timestamp: 2025-09-14T09:46:09.531Z
Learning: For fluent-bit Debian packaging, /opt/fluent-bit/bin/ is the appropriate installation path since the package may be installed from non-official Debian sources, making /opt compliant with FHS for optional software packages.

Applied to files:

  • dockerfiles/README.md
📚 Learning: 2025-09-04T07:28:37.083Z
Learnt from: cosmo0920
Repo: fluent/fluent-bit PR: 10832
File: src/wasm/CMakeLists.txt:112-131
Timestamp: 2025-09-04T07:28:37.083Z
Learning: In fluent-bit CMake files, the user cosmo0920 prefers treating Git as a command rather than a package, emphasizing that Git is not a pkg-config retrievable package but just a command.

Applied to files:

  • dockerfiles/README.md
📚 Learning: 2025-08-31T12:46:11.940Z
Learnt from: ThomasDevoogdt
Repo: fluent/fluent-bit PR: 9277
File: .github/workflows/pr-compile-check.yaml:147-151
Timestamp: 2025-08-31T12:46:11.940Z
Learning: In fluent-bit, the correct CMake flag for using system librdkafka is `FLB_PREFER_SYSTEM_LIB_KAFKA=ON`.

Applied to files:

  • dockerfiles/README.md
  • dockerfiles/Dockerfile
📚 Learning: 2025-08-29T06:24:44.797Z
Learnt from: shadowshot-x
Repo: fluent/fluent-bit PR: 10794
File: src/aws/flb_aws_compress.c:26-26
Timestamp: 2025-08-29T06:24:44.797Z
Learning: In Fluent Bit, ZSTD support is always available and enabled by default. The build system automatically detects and uses either the system libzstd library or builds the bundled ZSTD version. Unlike other optional dependencies like Arrow which use conditional compilation guards (e.g., FLB_HAVE_ARROW), ZSTD does not require conditional includes or build flags.

Applied to files:

  • dockerfiles/README.md
📚 Learning: 2025-08-29T06:25:27.250Z
Learnt from: shadowshot-x
Repo: fluent/fluent-bit PR: 10794
File: tests/internal/aws_compress.c:93-107
Timestamp: 2025-08-29T06:25:27.250Z
Learning: In Fluent Bit, ZSTD compression is enabled by default and is treated as a core dependency, not requiring conditional compilation guards like `#ifdef FLB_HAVE_ZSTD`. Unlike some other optional components such as ARROW/PARQUET (which use `#ifdef FLB_HAVE_ARROW` guards), ZSTD support is always available and doesn't need build-time conditionals. ZSTD headers are included directly without guards across multiple plugins and core components.

Applied to files:

  • dockerfiles/README.md
📚 Learning: 2025-08-29T06:24:55.855Z
Learnt from: shadowshot-x
Repo: fluent/fluent-bit PR: 10794
File: src/aws/flb_aws_compress.c:52-56
Timestamp: 2025-08-29T06:24:55.855Z
Learning: ZSTD compression is always available in Fluent Bit and does not require conditional compilation guards. Unlike Arrow/Parquet which use #ifdef FLB_HAVE_ARROW guards, ZSTD is built unconditionally with flb_zstd.c included directly in src/CMakeLists.txt and a bundled ZSTD library at lib/zstd-1.5.7/.

Applied to files:

  • dockerfiles/README.md
📚 Learning: 2025-09-08T11:21:33.975Z
Learnt from: cosmo0920
Repo: fluent/fluent-bit PR: 10851
File: include/fluent-bit/flb_simd.h:60-66
Timestamp: 2025-09-08T11:21:33.975Z
Learning: Fluent Bit currently only supports MSVC compiler on Windows, so additional compiler compatibility guards may be unnecessary for Windows-specific code paths.

Applied to files:

  • dockerfiles/README.md
📚 Learning: 2025-08-29T06:25:27.250Z
Learnt from: shadowshot-x
Repo: fluent/fluent-bit PR: 10794
File: tests/internal/aws_compress.c:93-107
Timestamp: 2025-08-29T06:25:27.250Z
Learning: In Fluent Bit, ZSTD compression is enabled by default and is treated as a core dependency, not requiring conditional compilation guards like `#ifdef FLB_HAVE_ZSTD`. Unlike some other optional components, ZSTD support is always available and doesn't need build-time conditionals.

Applied to files:

  • dockerfiles/README.md
📚 Learning: 2025-08-31T12:46:11.940Z
Learnt from: ThomasDevoogdt
Repo: fluent/fluent-bit PR: 9277
File: .github/workflows/pr-compile-check.yaml:147-151
Timestamp: 2025-08-31T12:46:11.940Z
Learning: In fluent-bit CMakeLists.txt, the system library preference flags are defined as FLB_PREFER_SYSTEM_LIB_ZSTD and FLB_PREFER_SYSTEM_LIB_KAFKA with the FLB_ prefix.

Applied to files:

  • dockerfiles/README.md
  • dockerfiles/Dockerfile
📚 Learning: 2025-08-07T10:15:46.187Z
Learnt from: cosmo0920
Repo: fluent/fluent-bit PR: 10691
File: .github/workflows/unit-tests.yaml:94-101
Timestamp: 2025-08-07T10:15:46.187Z
Learning: Apache Arrow latest packages have changing checksums, making hardcoded checksum verification impractical for CI workflows that download the latest version.

Applied to files:

  • dockerfiles/Dockerfile
🪛 LanguageTool
dockerfiles/README.md

[style] ~117-~117: ‘prior to’ might be wordy. Consider a shorter alternative.
Context: ...ectures staging images used for testing prior to release - `ghcr.io/fluent/fluent-bit/ma...

(EN_WORDINESS_PREMIUM_PRIOR_TO)


[style] ~156-~156: Consider using a less common alternative to make your writing sound more unique and professional.
Context: ...Arrow with static linking. ## Contact Feel free to join us on our Mailing List or IRC: - ...

(FEEL_FREE_TO_STYLE_ME)

🪛 markdownlint-cli2 (0.18.1)
dockerfiles/README.md

25-25: Fenced code blocks should have a language specified

(MD040, fenced-code-language)


30-30: Bare URL used

(MD034, no-bare-urls)


32-32: Fenced code blocks should have a language specified

(MD040, fenced-code-language)


38-38: Bare URL used

(MD034, no-bare-urls)


40-40: Fenced code blocks should have a language specified

(MD040, fenced-code-language)


58-58: Fenced code blocks should have a language specified

(MD040, fenced-code-language)


59-59: Dollar signs used before commands without showing output

(MD014, commands-show-output)


66-66: Fenced code blocks should have a language specified

(MD040, fenced-code-language)


159-159: Bare URL used

(MD034, no-bare-urls)


160-160: Bare URL used

(MD034, no-bare-urls)


162-162: Bare URL used

(MD034, no-bare-urls)

🔇 Additional comments (9)
dockerfiles/Dockerfile (7)

24-26: LGTM!

The EXTRA_CMAKE_FLAGS build argument and environment variable are correctly declared, enabling optional features like Parquet support to be controlled at build time.


82-83: LGTM!

The EXTRA_CMAKE_FLAGS is correctly redeclared in the builder stage, as Docker multi-stage builds require ARG redeclaration in each stage where it's used.


136-138: LGTM!

Correctly redeclares EXTRA_CMAKE_FLAGS for use in the deb-extractor stage's conditional Arrow/Parquet package provisioning.


193-204: Good maintainer documentation for Arrow SONAME versioning.

The maintainer comment clearly explains the Arrow SONAME naming convention and when updates are needed. This addresses the concern raised in earlier reviews about hardcoded package versions. The dependency list (libabsl20240722, libbz2-1.0, libprotobuf32t64, libthrift-0.19.0t64, libxml2) is comprehensive.


270-272: LGTM!

Correctly redeclares EXTRA_CMAKE_FLAGS for the debug stage's conditional Arrow/Parquet installation.


304-318: Package list correctly synchronized with deb-extractor.

The Arrow/Parquet package list matches the deb-extractor stage (lines 195-204), including all runtime dependencies. The maintainer comment helps ensure future updates maintain this synchronization. Note that the apache-arrow-apt-source-latest-trixie.deb usage here has the same reproducibility concern mentioned for earlier stages.


61-67: The proposed version pinning approach is not feasible; Apache Arrow does not publish versioned apt-source packages for Debian.

The apache-arrow-apt-source-latest-trixie.deb package is the only available release from Apache Arrow for this distribution. This package simply configures the Arrow repository; individual packages like libarrow-dev and libparquet-dev are versioned separately within that repository. To achieve reproducible builds, pin specific Arrow versions at the install stage instead (e.g., apt-get install -y libarrow-dev=14.0.0-1 libparquet-dev=14.0.0-1), or accept that this optional feature receives the latest available packages from the configured repository.

Likely an incorrect or invalid review comment.

dockerfiles/README.md (2)

115-119: LGTM!

Formatting change to use prefix bullets for consistency with other sections. No functional impact.


159-162: LGTM!

Formatting update to contact section. The consistent prefix style improves readability.

@kalavt kalavt force-pushed the feature/docker-arrow branch from 0536f4e to adbe478 Compare December 25, 2025 12:43
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (2)
dockerfiles/README.md (2)

66-70: Consider adding language identifier to the code block.

The fenced code block at line 66 would benefit from a bash language identifier for improved syntax highlighting and consistency with Markdown best practices.

🔎 Proposed fix
-```
+```bash
 $ docker build -t fluent/fluent-bit --target=production \
   --build-arg EXTRA_CMAKE_FLAGS="-DFLB_PARQUET_ENCODER=On" \
   -f dockerfiles/Dockerfile .

146-151: Consider adding language identifier to the code block.

The fenced code block at line 146 would benefit from a powershell language identifier for improved syntax highlighting and consistency.

🔎 Proposed fix
-```powershell
+```powershell
 docker build --no-cache `
   --build-arg WINDOWS_VERSION=ltsc2019 `
   --build-arg EXTRA_CMAKE_FLAGS="-DFLB_PARQUET_ENCODER=On" `
   -t fluent/fluent-bit:master-windows -f ./dockerfiles/Dockerfile.windows .

Wait, looking at line 146, it already has powershell identifier. Let me re-check the annotated code... Line 146 shows:

```powershell

So the language identifier IS present. This means static analysis is wrong or I misread. Let me skip this comment.

Actually, wait - let me reread the code more carefully. The code shows:

136~
```powershell
...

So lines 136-140 already have powershell identifier. Then lines 146-151 are new:

146~

docker build --no-cache `
  --build-arg WINDOWS_VERSION=ltsc2019 `
  --build-arg EXTRA_CMAKE_FLAGS="-DFLB_PARQUET_ENCODER=On" `
  -t fluent/fluent-bit:master-windows -f ./dockerfiles/Dockerfile.windows .

So line 146 DOES have the powershell identifier. No issue here. Skip this comment.
[scratchpad_end] -->

<!-- [scratchpad_start]
[task review_file_1/3 segment_8: lines 159-162]
Review contact section formatting.
[observations]
- Lines 159-162 have tilde markers indicating changes.
- These are formatting/whitespace changes to contact links.
[analysis]
- Cosmetic changes.
- No functional impact.
- Skip.
[actions]
Skip comment.
[done]
README.md review complete.
[scratchpad_end] -->

</blockquote></details>

</blockquote></details>

<details>
<summary>📜 Review details</summary>

**Configuration used**: defaults

**Review profile**: CHILL

**Plan**: Pro

<details>
<summary>📥 Commits</summary>

Reviewing files that changed from the base of the PR and between 0536f4e3c00fcd8a2c37a69d275bd68c323b2f0b and adbe4783d9a235081448b0612480f22b535e3385.

</details>

<details>
<summary>📒 Files selected for processing (3)</summary>

* `dockerfiles/Dockerfile`
* `dockerfiles/Dockerfile.windows`
* `dockerfiles/README.md`

</details>

<details>
<summary>🧰 Additional context used</summary>

<details>
<summary>🧠 Learnings (11)</summary>

<details>
<summary>📓 Common learnings</summary>

Learnt from: ThomasDevoogdt
Repo: fluent/fluent-bit PR: 9277
File: .github/workflows/pr-compile-check.yaml:147-151
Timestamp: 2025-08-31T12:46:11.940Z
Learning: In fluent-bit, the correct CMake flag for using system librdkafka is FLB_PREFER_SYSTEM_LIB_KAFKA=ON.


Learnt from: ThomasDevoogdt
Repo: fluent/fluent-bit PR: 9277
File: .github/workflows/pr-compile-check.yaml:147-151
Timestamp: 2025-08-31T12:46:11.940Z
Learning: In fluent-bit CMakeLists.txt, the system library preference flags are defined as FLB_PREFER_SYSTEM_LIB_ZSTD and FLB_PREFER_SYSTEM_LIB_KAFKA with the FLB_ prefix.


</details>
<details>
<summary>📚 Learning: 2025-08-07T10:15:46.187Z</summary>

Learnt from: cosmo0920
Repo: fluent/fluent-bit PR: 10691
File: .github/workflows/unit-tests.yaml:94-101
Timestamp: 2025-08-07T10:15:46.187Z
Learning: Apache Arrow latest packages have changing checksums, making hardcoded checksum verification impractical for CI workflows that download the latest version.


**Applied to files:**
- `dockerfiles/Dockerfile`
- `dockerfiles/Dockerfile.windows`

</details>
<details>
<summary>📚 Learning: 2025-08-31T12:46:11.940Z</summary>

Learnt from: ThomasDevoogdt
Repo: fluent/fluent-bit PR: 9277
File: .github/workflows/pr-compile-check.yaml:147-151
Timestamp: 2025-08-31T12:46:11.940Z
Learning: In fluent-bit, the correct CMake flag for using system librdkafka is FLB_PREFER_SYSTEM_LIB_KAFKA=ON.


**Applied to files:**
- `dockerfiles/Dockerfile`
- `dockerfiles/README.md`

</details>
<details>
<summary>📚 Learning: 2025-08-31T12:46:11.940Z</summary>

Learnt from: ThomasDevoogdt
Repo: fluent/fluent-bit PR: 9277
File: .github/workflows/pr-compile-check.yaml:147-151
Timestamp: 2025-08-31T12:46:11.940Z
Learning: In fluent-bit CMakeLists.txt, the system library preference flags are defined as FLB_PREFER_SYSTEM_LIB_ZSTD and FLB_PREFER_SYSTEM_LIB_KAFKA with the FLB_ prefix.


**Applied to files:**
- `dockerfiles/Dockerfile`
- `dockerfiles/README.md`

</details>
<details>
<summary>📚 Learning: 2025-09-14T09:46:09.531Z</summary>

Learnt from: aminvakil
Repo: fluent/fluent-bit PR: 10844
File: conf/fluent-bit:13-15
Timestamp: 2025-09-14T09:46:09.531Z
Learning: For fluent-bit Debian packaging, /opt/fluent-bit/bin/ is the appropriate installation path since the package may be installed from non-official Debian sources, making /opt compliant with FHS for optional software packages.


**Applied to files:**
- `dockerfiles/README.md`

</details>
<details>
<summary>📚 Learning: 2025-09-04T07:28:37.083Z</summary>

Learnt from: cosmo0920
Repo: fluent/fluent-bit PR: 10832
File: src/wasm/CMakeLists.txt:112-131
Timestamp: 2025-09-04T07:28:37.083Z
Learning: In fluent-bit CMake files, the user cosmo0920 prefers treating Git as a command rather than a package, emphasizing that Git is not a pkg-config retrievable package but just a command.


**Applied to files:**
- `dockerfiles/README.md`

</details>
<details>
<summary>📚 Learning: 2025-08-29T06:24:44.797Z</summary>

Learnt from: shadowshot-x
Repo: fluent/fluent-bit PR: 10794
File: src/aws/flb_aws_compress.c:26-26
Timestamp: 2025-08-29T06:24:44.797Z
Learning: In Fluent Bit, ZSTD support is always available and enabled by default. The build system automatically detects and uses either the system libzstd library or builds the bundled ZSTD version. Unlike other optional dependencies like Arrow which use conditional compilation guards (e.g., FLB_HAVE_ARROW), ZSTD does not require conditional includes or build flags.


**Applied to files:**
- `dockerfiles/README.md`

</details>
<details>
<summary>📚 Learning: 2025-08-29T06:25:27.250Z</summary>

Learnt from: shadowshot-x
Repo: fluent/fluent-bit PR: 10794
File: tests/internal/aws_compress.c:93-107
Timestamp: 2025-08-29T06:25:27.250Z
Learning: In Fluent Bit, ZSTD compression is enabled by default and is treated as a core dependency, not requiring conditional compilation guards like #ifdef FLB_HAVE_ZSTD. Unlike some other optional components such as ARROW/PARQUET (which use #ifdef FLB_HAVE_ARROW guards), ZSTD support is always available and doesn't need build-time conditionals. ZSTD headers are included directly without guards across multiple plugins and core components.


**Applied to files:**
- `dockerfiles/README.md`

</details>
<details>
<summary>📚 Learning: 2025-08-29T06:24:55.855Z</summary>

Learnt from: shadowshot-x
Repo: fluent/fluent-bit PR: 10794
File: src/aws/flb_aws_compress.c:52-56
Timestamp: 2025-08-29T06:24:55.855Z
Learning: ZSTD compression is always available in Fluent Bit and does not require conditional compilation guards. Unlike Arrow/Parquet which use #ifdef FLB_HAVE_ARROW guards, ZSTD is built unconditionally with flb_zstd.c included directly in src/CMakeLists.txt and a bundled ZSTD library at lib/zstd-1.5.7/.


**Applied to files:**
- `dockerfiles/README.md`

</details>
<details>
<summary>📚 Learning: 2025-09-08T11:21:33.975Z</summary>

Learnt from: cosmo0920
Repo: fluent/fluent-bit PR: 10851
File: include/fluent-bit/flb_simd.h:60-66
Timestamp: 2025-09-08T11:21:33.975Z
Learning: Fluent Bit currently only supports MSVC compiler on Windows, so additional compiler compatibility guards may be unnecessary for Windows-specific code paths.


**Applied to files:**
- `dockerfiles/README.md`

</details>
<details>
<summary>📚 Learning: 2025-08-29T06:25:27.250Z</summary>

Learnt from: shadowshot-x
Repo: fluent/fluent-bit PR: 10794
File: tests/internal/aws_compress.c:93-107
Timestamp: 2025-08-29T06:25:27.250Z
Learning: In Fluent Bit, ZSTD compression is enabled by default and is treated as a core dependency, not requiring conditional compilation guards like #ifdef FLB_HAVE_ZSTD. Unlike some other optional components, ZSTD support is always available and doesn't need build-time conditionals.


**Applied to files:**
- `dockerfiles/README.md`

</details>

</details><details>
<summary>🪛 LanguageTool</summary>

<details>
<summary>dockerfiles/README.md</summary>

[style] ~117-~117: ‘prior to’ might be wordy. Consider a shorter alternative.
Context: ...ectures staging images used for testing prior to release - `ghcr.io/fluent/fluent-bit/ma...

(EN_WORDINESS_PREMIUM_PRIOR_TO)

---

[style] ~156-~156: Consider using a less common alternative to make your writing sound more unique and professional.
Context: ...Arrow with static linking.  ## Contact  Feel free to join us on our Mailing List or IRC:  - ...

(FEEL_FREE_TO_STYLE_ME)

</details>

</details>
<details>
<summary>🪛 markdownlint-cli2 (0.18.1)</summary>

<details>
<summary>dockerfiles/README.md</summary>

25-25: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

---

30-30: Bare URL used

(MD034, no-bare-urls)

---

32-32: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

---

38-38: Bare URL used

(MD034, no-bare-urls)

---

40-40: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

---

58-58: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

---

59-59: Dollar signs used before commands without showing output

(MD014, commands-show-output)

---

66-66: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

---

159-159: Bare URL used

(MD034, no-bare-urls)

---

160-160: Bare URL used

(MD034, no-bare-urls)

---

162-162: Bare URL used

(MD034, no-bare-urls)

</details>

</details>

</details>

<details>
<summary>🔇 Additional comments (10)</summary><blockquote>

<details>
<summary>dockerfiles/README.md (1)</summary><blockquote>

`115-119`: **LGTM—improved bullet formatting.**

The consistent bullet formatting enhances readability and maintains a cleaner structure for the ghcr.io topology documentation.

</blockquote></details>
<details>
<summary>dockerfiles/Dockerfile.windows (3)</summary><blockquote>

`19-21`: **LGTM—unified build argument added correctly.**

The `EXTRA_CMAKE_FLAGS` build argument is properly declared with a clear comment and sensible default value, enabling optional feature configuration.

---

`160-169`: **LGTM—conditional Arrow/Parquet installation implemented correctly.**

The conditional vcpkg installation uses appropriate PowerShell pattern matching and follows best practices with cleanup steps. The `-like` operator is case-insensitive, which correctly handles CMake flag variations.

---

`186-204`: **LGTM—CMake configuration correctly derives Arrow/Parquet paths.**

The conditional path derivation correctly uses cmd shell environment variables within a single RUN instruction. The `findstr` pattern matching and `|| cd .` fallback ensure the build chain succeeds whether or not Parquet support is enabled. The vcpkg package paths for Arrow and Parquet are correctly specified.

</blockquote></details>
<details>
<summary>dockerfiles/Dockerfile (6)</summary><blockquote>

`24-27`: **LGTM—build argument properly declared.**

The `EXTRA_CMAKE_FLAGS` argument is correctly declared as both ARG and ENV, with a clear comment explaining its purpose.

---

`61-67`: **LGTM—conditional Arrow package installation implemented correctly.**

The conditional logic correctly checks for the Parquet encoder flag and installs the necessary development packages. The apt source is added, updated, and cleaned up appropriately.

---

`82-83`: **LGTM—build argument correctly re-declared for builder stage.**

Multi-stage Dockerfile builds require ARG re-declaration in each stage that needs the value. This is correctly implemented.

---

`136-138`: **LGTM—build argument correctly declared in deb-extractor stage.**

Consistent with the other stages, the build argument is properly declared for the package extraction stage.

---

`192-204`: **LGTM—maintainer notes and conditional package downloads implemented correctly.**

The maintainer comment clearly documents the Arrow SONAME versioning scheme, addressing the maintenance concern from previous reviews. The conditional package downloads include all necessary Arrow/Parquet runtime dependencies.

---

`304-319`: **LGTM—debug stage Arrow packages correctly synchronized.**

The conditional Arrow package installation includes all dependencies from the deb-extractor stage, ensuring consistency between the production and debug images. The maintainer note provides clear guidance for future maintenance.

</blockquote></details>

</blockquote></details>

</details>

<!-- This is an auto-generated comment by CodeRabbit for review status -->

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant