out_azure_blob: add zstd compression support #11202
base: master
Walkthrough
Adds generalized compression support (NONE/GZIP/ZSTD) to the Azure Blob output plugin, introducing compression helpers, per-path/network compression flags, content-type/encoding handling for ZSTD, URI extension logic, configuration parsing changes, and unit tests for compression behaviors.
Sequence Diagram(s)
sequenceDiagram
participant App as Application
participant Plugin as AzureBlobPlugin
participant Compressor as Compressor
participant HTTP as HTTPClient
participant Net as Network
App->>Plugin: Submit payload + ctx->compression (NONE/GZIP/ZSTD)
Plugin->>Plugin: Decide network vs blob compression flags
alt compression != NONE
Plugin->>Compressor: azure_blob_compress_payload(algorithm, data)
alt compress succeeds
Compressor-->>Plugin: compressed payload
Plugin->>Plugin: mark applied flags (blob/network)
else compress fails
Compressor-->>Plugin: error
Plugin->>Plugin: revert to original payload
end
else
Plugin->>Plugin: keep original payload
end
Plugin->>Plugin: ext = azb_blob_extension()
Plugin->>HTTP: azb_http_canonical_request(content_type, content_encoding)
HTTP->>Net: Send PUT with Content-Type & Content-Encoding
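The compress-or-fall-back branch in the diagram can be sketched in isolation. This is a minimal sketch, not the plugin's code: `sketch_prepare`, `struct sketch_payload`, the `SKETCH_*` constants, and the two toy compressors are hypothetical names introduced here, and the compressor signature (0 on success, -1 on failure) is an assumption modeled on the return values described in the review.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for the plugin's algorithm constants. */
enum sketch_algo { SKETCH_NONE, SKETCH_GZIP, SKETCH_ZSTD };

/* Assumed compressor shape: returns 0 on success, -1 on failure. */
typedef int (*sketch_compress_fn)(const void *in, size_t in_len,
                                  void **out, size_t *out_len);

struct sketch_payload {
    const void *data;   /* bytes that will actually be sent */
    size_t size;
    void *owned;        /* non-NULL when the caller must free() it */
    int compressed;     /* 1 only if compression was applied */
};

/* Try to compress; on any failure keep the original buffer untouched. */
static struct sketch_payload sketch_prepare(enum sketch_algo algo,
                                            sketch_compress_fn fn,
                                            const void *data, size_t size)
{
    struct sketch_payload p;
    void *out = NULL;
    size_t out_len = 0;

    p.data = data;
    p.size = size;
    p.owned = NULL;
    p.compressed = 0;

    if (algo == SKETCH_NONE || fn == NULL) {
        return p;               /* compression not requested */
    }
    if (fn(data, size, &out, &out_len) != 0) {
        return p;               /* revert to the original payload */
    }
    p.data = out;
    p.size = out_len;
    p.owned = out;
    p.compressed = 1;
    return p;
}

/* Toy compressor that always fails, to exercise the fallback branch. */
static int sketch_always_fail(const void *in, size_t in_len,
                              void **out, size_t *out_len)
{
    (void) in; (void) in_len; (void) out; (void) out_len;
    return -1;
}

/* Toy "compressor" that just copies its input, to exercise success. */
static int sketch_copy(const void *in, size_t in_len,
                       void **out, size_t *out_len)
{
    void *buf = malloc(in_len ? in_len : 1);

    if (buf == NULL) {
        return -1;
    }
    memcpy(buf, in, in_len);
    *out = buf;
    *out_len = in_len;
    return 0;
}
```

Because the `compressed` flag is set only after the compressor reports success, any header or extension decision derived from it stays consistent with the bytes that are actually sent.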
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20–25 minutes
Pre-merge checks and finishing touches
❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
Actionable comments posted: 0
🧹 Nitpick comments (2)
plugins/out_azure_blob/azure_blob_blockblob.c (1)
92-92: Minor: `const char*` assigned to `char*` pointer.
The function `azb_blob_extension` returns `const char*`, but `ext` is declared as `char*`. While this works, it's technically a const-correctness issue. Consider updating the variable declaration for consistency:
-    char *ext;
+    const char *ext;
tests/internal/azure_blob.c (1)
128-143: Clarify test name to match behavior.
The test `test_block_blob_extension_gzip_default` sets `compression = FLB_COMPRESSION_ALGORITHM_NONE` but expects `.gz`. This correctly tests the fallback behavior in `azb_blob_extension()` where gzip is the default when `compress_blob` is enabled without a specific algorithm. Consider adding a brief comment to clarify this intent.
 static void test_block_blob_extension_gzip_default()
 {
     struct flb_azure_blob ctx;
     flb_sds_t uri;

     azure_blob_ctx_init(&ctx);
     ctx.compress_blob = FLB_TRUE;
-    ctx.compression = FLB_COMPRESSION_ALGORITHM_NONE;
+    ctx.compression = FLB_COMPRESSION_ALGORITHM_NONE; /* defaults to gzip when compress_blob is enabled */
     uri = azb_block_blob_uri(&ctx, "file", "block", 123, "rand");
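The fallback that this test exercises can be sketched as a small selector. This is only an illustration under stated assumptions: `ext_select` and the `EXT_ALGO_*` constants are hypothetical names, the `.zst` suffix for ZSTD and the empty string for the uncompressed case are assumptions that the review text does not confirm, and only the `.gz` fallback comes from the comment above.

```c
#include <stddef.h>

/* Hypothetical stand-ins for FLB_COMPRESSION_ALGORITHM_* values. */
enum ext_algo { EXT_ALGO_NONE, EXT_ALGO_GZIP, EXT_ALGO_ZSTD };

/*
 * Pick a blob name suffix. Returning const char * (and storing the
 * result in a const char * variable) keeps the string literals
 * read-only, which is the const-correctness point raised above.
 */
static const char *ext_select(int compress_blob, enum ext_algo algo)
{
    if (!compress_blob) {
        return "";          /* assumption: no suffix when not compressing */
    }
    if (algo == EXT_ALGO_ZSTD) {
        return ".zst";      /* assumption: suffix chosen for ZSTD blobs */
    }
    return ".gz";           /* NONE or GZIP both fall back to gzip */
}
```

With this shape, `compress_blob` enabled and the algorithm left at NONE yields `.gz`, which is the case `test_block_blob_extension_gzip_default` checks.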
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (8)
- plugins/out_azure_blob/azure_blob.c (6 hunks)
- plugins/out_azure_blob/azure_blob.h (2 hunks)
- plugins/out_azure_blob/azure_blob_blockblob.c (4 hunks)
- plugins/out_azure_blob/azure_blob_conf.c (2 hunks)
- plugins/out_azure_blob/azure_blob_http.c (3 hunks)
- plugins/out_azure_blob/azure_blob_http.h (1 hunks)
- tests/internal/CMakeLists.txt (1 hunks)
- tests/internal/azure_blob.c (1 hunks)
🧰 Additional context used
🧠 Learnings (11)
📚 Learning: 2025-08-29T06:25:27.250Z
Learnt from: shadowshot-x
Repo: fluent/fluent-bit PR: 10794
File: tests/internal/aws_compress.c:93-107
Timestamp: 2025-08-29T06:25:27.250Z
Learning: In Fluent Bit, ZSTD compression is enabled by default and is treated as a core dependency, not requiring conditional compilation guards like `#ifdef FLB_HAVE_ZSTD`. Unlike some other optional components such as ARROW/PARQUET (which use `#ifdef FLB_HAVE_ARROW` guards), ZSTD support is always available and doesn't need build-time conditionals. ZSTD headers are included directly without guards across multiple plugins and core components.
Applied to files:
- plugins/out_azure_blob/azure_blob.h
- plugins/out_azure_blob/azure_blob_blockblob.c
- plugins/out_azure_blob/azure_blob_http.c
- plugins/out_azure_blob/azure_blob_conf.c
- tests/internal/azure_blob.c
- plugins/out_azure_blob/azure_blob.c
📚 Learning: 2025-08-29T06:24:26.170Z
Learnt from: shadowshot-x
Repo: fluent/fluent-bit PR: 10794
File: tests/internal/aws_compress.c:39-42
Timestamp: 2025-08-29T06:24:26.170Z
Learning: In Fluent Bit, ZSTD compression support is enabled by default and does not require conditional compilation guards (like #ifdef FLB_HAVE_ZSTD) around ZSTD-related code declarations and implementations.
Applied to files:
- plugins/out_azure_blob/azure_blob.h
- plugins/out_azure_blob/azure_blob_blockblob.c
- plugins/out_azure_blob/azure_blob_http.c
- plugins/out_azure_blob/azure_blob_conf.c
- plugins/out_azure_blob/azure_blob.c
📚 Learning: 2025-08-29T06:25:02.561Z
Learnt from: shadowshot-x
Repo: fluent/fluent-bit PR: 10794
File: tests/internal/aws_compress.c:7-7
Timestamp: 2025-08-29T06:25:02.561Z
Learning: In Fluent Bit, ZSTD (zstandard) compression library is bundled directly in the source tree at `lib/zstd-1.5.7` and is built unconditionally as a static library. Unlike optional external dependencies, ZSTD does not use conditional compilation guards like `FLB_HAVE_ZSTD` and is always available. Headers like `<fluent-bit/flb_zstd.h>` can be included directly without guards.
Applied to files:
- plugins/out_azure_blob/azure_blob.h
- plugins/out_azure_blob/azure_blob_blockblob.c
- plugins/out_azure_blob/azure_blob_http.c
- plugins/out_azure_blob/azure_blob_conf.c
- plugins/out_azure_blob/azure_blob.c
📚 Learning: 2025-08-29T06:24:55.855Z
Learnt from: shadowshot-x
Repo: fluent/fluent-bit PR: 10794
File: src/aws/flb_aws_compress.c:52-56
Timestamp: 2025-08-29T06:24:55.855Z
Learning: ZSTD compression is always available in Fluent Bit and does not require conditional compilation guards. Unlike Arrow/Parquet which use #ifdef FLB_HAVE_ARROW guards, ZSTD is built unconditionally with flb_zstd.c included directly in src/CMakeLists.txt and a bundled ZSTD library at lib/zstd-1.5.7/.
Applied to files:
- plugins/out_azure_blob/azure_blob.h
- plugins/out_azure_blob/azure_blob_blockblob.c
- plugins/out_azure_blob/azure_blob_http.c
- plugins/out_azure_blob/azure_blob_conf.c
- plugins/out_azure_blob/azure_blob.c
📚 Learning: 2025-08-29T06:25:27.250Z
Learnt from: shadowshot-x
Repo: fluent/fluent-bit PR: 10794
File: tests/internal/aws_compress.c:93-107
Timestamp: 2025-08-29T06:25:27.250Z
Learning: In Fluent Bit, ZSTD compression is enabled by default and is treated as a core dependency, not requiring conditional compilation guards like `#ifdef FLB_HAVE_ZSTD`. Unlike some other optional components, ZSTD support is always available and doesn't need build-time conditionals.
Applied to files:
- plugins/out_azure_blob/azure_blob.h
- plugins/out_azure_blob/azure_blob_blockblob.c
- plugins/out_azure_blob/azure_blob_http.c
- plugins/out_azure_blob/azure_blob_conf.c
- plugins/out_azure_blob/azure_blob.c
📚 Learning: 2025-08-29T06:24:44.797Z
Learnt from: shadowshot-x
Repo: fluent/fluent-bit PR: 10794
File: src/aws/flb_aws_compress.c:26-26
Timestamp: 2025-08-29T06:24:44.797Z
Learning: In Fluent Bit, ZSTD support is always available and enabled by default. The build system automatically detects and uses either the system libzstd library or builds the bundled ZSTD version. Unlike other optional dependencies like Arrow which use conditional compilation guards (e.g., FLB_HAVE_ARROW), ZSTD does not require conditional includes or build flags.
Applied to files:
- plugins/out_azure_blob/azure_blob_blockblob.c
- plugins/out_azure_blob/azure_blob_http.c
- plugins/out_azure_blob/azure_blob_conf.c
- plugins/out_azure_blob/azure_blob.c
📚 Learning: 2025-08-31T12:46:11.940Z
Learnt from: ThomasDevoogdt
Repo: fluent/fluent-bit PR: 9277
File: .github/workflows/pr-compile-check.yaml:147-151
Timestamp: 2025-08-31T12:46:11.940Z
Learning: In fluent-bit CMakeLists.txt, the system library preference flags are defined as FLB_PREFER_SYSTEM_LIB_ZSTD and FLB_PREFER_SYSTEM_LIB_KAFKA with the FLB_ prefix.
Applied to files:
- plugins/out_azure_blob/azure_blob_blockblob.c
- plugins/out_azure_blob/azure_blob.c
📚 Learning: 2025-11-21T06:23:29.770Z
Learnt from: cosmo0920
Repo: fluent/fluent-bit PR: 11171
File: include/fluent-bit/flb_lib.h:52-53
Timestamp: 2025-11-21T06:23:29.770Z
Learning: In Fluent Bit core (fluent/fluent-bit repository), function descriptions/documentation are not required for newly added functions in header files.
Applied to files:
- plugins/out_azure_blob/azure_blob_blockblob.c
- plugins/out_azure_blob/azure_blob.c
📚 Learning: 2025-08-31T12:46:11.940Z
Learnt from: ThomasDevoogdt
Repo: fluent/fluent-bit PR: 9277
File: .github/workflows/pr-compile-check.yaml:147-151
Timestamp: 2025-08-31T12:46:11.940Z
Learning: In fluent-bit, the correct CMake flag for using system librdkafka is `FLB_PREFER_SYSTEM_LIB_KAFKA=ON`.
Applied to files:
plugins/out_azure_blob/azure_blob.c
📚 Learning: 2025-09-08T11:21:33.975Z
Learnt from: cosmo0920
Repo: fluent/fluent-bit PR: 10851
File: include/fluent-bit/flb_simd.h:60-66
Timestamp: 2025-09-08T11:21:33.975Z
Learning: Fluent Bit currently only supports MSVC compiler on Windows, so additional compiler compatibility guards may be unnecessary for Windows-specific code paths.
Applied to files:
plugins/out_azure_blob/azure_blob.c
🧬 Code graph analysis (4)
plugins/out_azure_blob/azure_blob_http.c (1)
src/flb_http_client.c (1)
flb_http_add_header(963-995)
plugins/out_azure_blob/azure_blob_conf.c (1)
src/flb_output.c (1)
flb_output_get_property(1108-1111)
tests/internal/azure_blob.c (2)
plugins/out_azure_blob/azure_blob_blockblob.c (1)
azb_block_blob_uri(72-121)
plugins/out_azure_blob/azure_blob_http.c (1)
azb_http_client_setup(299-382)
plugins/out_azure_blob/azure_blob.c (2)
src/flb_gzip.c (1)
flb_gzip_compress(157-252)
src/flb_zstd.c (1)
flb_zstd_compress(33-57)
🔇 Additional comments (18)
plugins/out_azure_blob/azure_blob_http.h (1)
31-35: LGTM!
The signature change to add the `content_encoding` parameter is well-designed, cleanly separating content type (payload format) from content encoding (transfer encoding). This aligns with the broader compression refactoring in this PR.
tests/internal/CMakeLists.txt (1)
21-21: LGTM!
The new `azure_blob.c` test file is correctly added to the unit test list. Based on learnings, ZSTD is always available in Fluent Bit as a core dependency, so no conditional compilation guards are needed.
plugins/out_azure_blob/azure_blob_blockblob.c (1)
35-46: LGTM!
The `azb_blob_extension` helper cleanly centralizes extension selection logic. The fallback to `.gz` when compression is enabled but not ZSTD maintains backward compatibility with existing gzip behavior.
plugins/out_azure_blob/azure_blob.h (2)
33-39: LGTM!
The new `AZURE_BLOB_CT_ZSTD` and `AZURE_BLOB_CE_ZSTD` macros follow the existing naming conventions and integer value pattern, cleanly extending the compression support.
59-59: LGTM!
The field rename from `compress_gzip` to `compression` is a good generalization that supports the multi-algorithm compression model introduced in this PR.
plugins/out_azure_blob/azure_blob_http.c (3)
196-198: LGTM!
The ZSTD content encoding branch correctly maps `AZURE_BLOB_CE_ZSTD` to the "zstd" encoding string, following the same pattern as the existing GZIP handling.
229-231: LGTM!
The ZSTD content type mapping to "application/zstd" is correct and follows the existing pattern for GZIP.
324-339: LGTM!
The HTTP header additions for ZSTD content type and content encoding are correctly implemented, mirroring the existing GZIP handling pattern.
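The constant-to-string mapping described in these three comments can be sketched as follows. This is a hedged illustration only: `sketch_ce_token`, `sketch_ce_header`, and the `SKETCH_CE_*` constants are hypothetical names standing in for the plugin's `AZURE_BLOB_CE_*` macros, whose actual integer values are not shown in the review; only the "gzip" and "zstd" tokens come from the text above.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-ins for the AZURE_BLOB_CE_* macros. */
enum sketch_ce { SKETCH_CE_NONE = 0, SKETCH_CE_GZIP, SKETCH_CE_ZSTD };

/* Map an encoding constant to its header token; NULL means "omit". */
static const char *sketch_ce_token(enum sketch_ce ce)
{
    switch (ce) {
    case SKETCH_CE_GZIP:
        return "gzip";
    case SKETCH_CE_ZSTD:
        return "zstd";
    default:
        return NULL;
    }
}

/*
 * Format a Content-Encoding header line into buf.
 * Returns the number of characters written, 0 when no header is
 * needed, or -1 if buf is too small.
 */
static int sketch_ce_header(enum sketch_ce ce, char *buf, size_t size)
{
    const char *token = sketch_ce_token(ce);
    int n;

    if (token == NULL) {
        return 0;
    }
    n = snprintf(buf, size, "Content-Encoding: %s", token);
    if (n < 0 || (size_t) n >= size) {
        return -1;
    }
    return n;
}
```

Keeping the token lookup separate from the formatting step means one mapping can serve both the request headers and any string-to-sign, so the two cannot drift apart.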
plugins/out_azure_blob/azure_blob_conf.c (1)
659-673: LGTM!
The compression configuration parsing is well-implemented:
- Uses `FLB_COMPRESSION_ALGORITHM_NONE` as the sensible default
- Case-insensitive matching improves user experience
- Clear error message listing valid options helps with troubleshooting
- Returning NULL on invalid values prevents running with misconfiguration
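The parsing behavior listed above can be sketched as a standalone function. This is a sketch under assumptions, not the code in `azure_blob_conf.c`: `cfg_parse_compression`, `cfg_ieq`, and the `CFG_*` constants are hypothetical, and the accepted spellings ("none", "gzip", "zstd") are assumed rather than taken from the PR. A hand-written comparison is used instead of `strcasecmp` so the sketch does not depend on POSIX headers.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical stand-ins for FLB_COMPRESSION_ALGORITHM_* values. */
enum cfg_algo { CFG_NONE, CFG_GZIP, CFG_ZSTD };

/* Case-insensitive equality for ASCII strings. */
static int cfg_ieq(const char *a, const char *b)
{
    while (*a != '\0' && *b != '\0') {
        if (tolower((unsigned char) *a) != tolower((unsigned char) *b)) {
            return 0;
        }
        a++;
        b++;
    }
    return *a == '\0' && *b == '\0';
}

/*
 * Parse the configured value. An unset property (NULL) selects the
 * NONE default; an unrecognized value returns -1 so the caller can
 * log the valid options and abort initialization.
 */
static int cfg_parse_compression(const char *value, enum cfg_algo *out)
{
    if (value == NULL || cfg_ieq(value, "none")) {
        *out = CFG_NONE;
        return 0;
    }
    if (cfg_ieq(value, "gzip")) {
        *out = CFG_GZIP;
        return 0;
    }
    if (cfg_ieq(value, "zstd")) {
        *out = CFG_ZSTD;
        return 0;
    }
    return -1;
}
```

Rejecting unknown values at configuration time, rather than silently treating them as NONE, is what keeps a typo such as `zsdt` from producing uncompressed uploads unnoticed.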
tests/internal/azure_blob.c (3)
42-74: LGTM!
The fixture initialization/cleanup helpers properly handle resource management with appropriate error paths and cleanup on partial failures.
163-212: LGTM!
Good test coverage for ZSTD content encoding header verification. The test correctly validates that `Content-Encoding: zstd` is set when using `AZURE_BLOB_CE_ZSTD`.
258-265: LGTM!
The test list is properly structured with the null terminator. Good coverage of both extension behavior and HTTP header behavior for the new ZSTD support.
plugins/out_azure_blob/azure_blob.c (6)
27-28: LGTM!
The new includes for `flb_compression.h` and `flb_zstd.h` are correctly added. Based on learnings, ZSTD is a core dependency in Fluent Bit and doesn't require conditional compilation guards.
154-178: LGTM!
The helper functions `azure_blob_compress_payload` and `azure_blob_compression_name` are well-implemented:
- Clean dispatch to the appropriate compression function
- Safe fallback to "unknown" for invalid algorithms in the name function
- Consistent return values (-1 for failure, 0 for success)
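The dispatch-and-name pattern described in this comment can be sketched with stub back ends. Everything here is illustrative: `sketch_compress_payload`, `sketch_compression_name`, `struct sketch_backends`, and the `NAME_ALGO_*` constants are hypothetical, the back ends are passed in as function pointers so the block stays self-contained, and returning "none" for the NONE case is an assumption. Only the "unknown" fallback and the 0 / -1 return convention come from the review text.

```c
#include <stddef.h>

/* Hypothetical stand-ins for FLB_COMPRESSION_ALGORITHM_* values. */
enum name_algo { NAME_ALGO_NONE = 0, NAME_ALGO_GZIP = 1, NAME_ALGO_ZSTD = 2 };

/* Assumed back-end shape: 0 on success, anything else on failure. */
typedef int (*sketch_fn)(const void *in, size_t in_len,
                         void **out, size_t *out_len);

struct sketch_backends {
    sketch_fn gzip;
    sketch_fn zstd;
};

/* Human-readable name for log messages; never returns NULL. */
static const char *sketch_compression_name(int algo)
{
    switch (algo) {
    case NAME_ALGO_NONE:
        return "none";      /* assumption: not stated in the review */
    case NAME_ALGO_GZIP:
        return "gzip";
    case NAME_ALGO_ZSTD:
        return "zstd";
    default:
        return "unknown";
    }
}

/* Route to the matching back end; normalize the result to 0 or -1. */
static int sketch_compress_payload(const struct sketch_backends *b, int algo,
                                   const void *in, size_t in_len,
                                   void **out, size_t *out_len)
{
    sketch_fn fn = NULL;

    switch (algo) {
    case NAME_ALGO_GZIP:
        fn = b->gzip;
        break;
    case NAME_ALGO_ZSTD:
        fn = b->zstd;
        break;
    default:
        return -1;          /* NONE or an invalid algorithm */
    }
    if (fn == NULL) {
        return -1;
    }
    return fn(in, in_len, out, out_len) == 0 ? 0 : -1;
}

/* Stub back end that reports success without producing output. */
static int sketch_stub_ok(const void *in, size_t in_len,
                          void **out, size_t *out_len)
{
    (void) in;
    (void) in_len;
    *out = NULL;
    *out_len = 0;
    return 0;
}
```

Since the name function always returns a valid string, a warning logged on compression failure can format it with `%s` without a NULL check.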
393-428: LGTM!
The compression logic is well-structured:
- Properly defaults to GZIP when `compress_blob` is enabled without a specific algorithm
- Gracefully handles compression failures by falling back to uncompressed with a warning
- Correctly tracks which type of compression was applied (network vs blob)
1853-1861: LGTM!
The config map descriptions are well-updated to document the new ZSTD support alongside GZIP, with clear explanations of behavior and compatibility constraints.
432-450: Based on the web search results provided and the code logic shown in the review comment, I can now verify the HTTP header semantics.
The cited examples pair the original media type with the transfer encoding: one upload sets Content-Type: "application/csv" with Content-Encoding: "gzip", and a CSS file is served with Content-Type: "text/css" and Content-Encoding: "gzip". By the same pattern, a gzip-compressed JSON payload should use Content-Type: "application/json" with Content-Encoding: "gzip", which matches the network compression logic.
The code's HTTP header semantics are correct. The distinction is properly implemented:
- Blob compression (stored format): Content-Type indicates the file format (gzip/zstd), no Content-Encoding
- Network compression (transfer encoding): Content-Encoding indicates the encoding, Content-Type describes the uncompressed content
No issues were found in the verification.
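The two header cases above can be sketched as one selection function. This is a minimal sketch assuming the semantics stated in the verification: `hdr_select`, `struct hdr_choice`, and the `HDR_ALGO_*` constants are hypothetical names, "application/json" as the uncompressed default and "application/gzip" as the gzip blob type are assumptions, and blob compression is given precedence over network compression only for the purpose of this illustration.

```c
#include <stddef.h>

/* Hypothetical stand-ins for FLB_COMPRESSION_ALGORITHM_* values. */
enum hdr_algo { HDR_ALGO_NONE, HDR_ALGO_GZIP, HDR_ALGO_ZSTD };

struct hdr_choice {
    const char *content_type;       /* always set */
    const char *content_encoding;   /* NULL means: omit the header */
};

/*
 * Blob compression: the stored object IS a gzip/zstd file, so only
 * Content-Type changes. Network compression: the object is JSON that
 * happens to be encoded, so Content-Encoding is added instead.
 */
static struct hdr_choice hdr_select(int blob_applied, int network_applied,
                                    enum hdr_algo algo)
{
    struct hdr_choice c;

    c.content_type = "application/json";    /* assumed default */
    c.content_encoding = NULL;

    if (blob_applied) {
        c.content_type = (algo == HDR_ALGO_ZSTD) ? "application/zstd"
                                                 : "application/gzip";
    }
    else if (network_applied) {
        c.content_encoding = (algo == HDR_ALGO_ZSTD) ? "zstd" : "gzip";
    }
    return c;
}
```

A client that downloads a blob-compressed object receives the compressed bytes as-is, while a client fetching a network-compressed object can rely on Content-Encoding to decode it back to JSON.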
141-143: Verify null-termination behavior when compression fails.
The null-termination is skipped when `ctx->compression != NONE`, but compression may fail later in `http_send_blob`. If compression fails and the payload is sent uncompressed, the buffer will lack null-termination.
This is likely safe because the HTTP client uses the explicit `body_size` rather than relying on null-termination, but it's worth verifying that all downstream consumers handle this correctly.
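The property this comment asks to verify is that consumers read the body by its explicit length and never call `strlen()` on it. A minimal sketch of a length-driven consumer, with `sketch_write_body` and `struct sketch_body` as hypothetical names unrelated to the Fluent Bit HTTP client API:

```c
#include <stddef.h>
#include <string.h>

/* A body described by pointer plus explicit size; no terminator needed. */
struct sketch_body {
    const char *data;
    size_t size;
};

/*
 * Copy exactly body.size bytes into dst. The function never inspects
 * the byte after the payload, so an unterminated buffer is safe.
 * Returns the number of bytes written, or 0 if dst is too small.
 */
static size_t sketch_write_body(char *dst, size_t dst_cap,
                                struct sketch_body body)
{
    if (body.size > dst_cap) {
        return 0;
    }
    memcpy(dst, body.data, body.size);
    return body.size;
}
```

Any downstream consumer that instead treats the buffer as a C string, for example by passing it to `strlen()` or to `printf("%s")`, would read past the end in the fallback case and is the kind of call site worth auditing.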
Signed-off-by: Nico Berlee <[email protected]>
Enter [N/A] in the box if an item is not applicable to your change.
Testing
Before we can approve your change, please submit the following in a comment:
debug.log
valgrind.log
If this is a change to packaging of containers or native binaries then please confirm it works for all targets.
ok-package-test label to test for all targets (requires maintainer to do).
Documentation
Backporting
Fluent Bit is licensed under Apache 2.0, by submitting this pull request I understand that this code will be released under the terms of that license.
Summary by CodeRabbit
New Features
Improvements
Documentation
Tests