Apache ORC provides both C++ and Java libraries for reading and writing ORC files, which are widely used by major data processing frameworks. ORC-based formats are also increasingly used to store large-scale AI training datasets.
Motivation
This integration follows the recent discovery of CVE-2025-47436, a heap buffer overflow vulnerability in the C++ LZO decompressor of Apache ORC. A specially crafted, malformed ORC file can trigger the overflow and corrupt memory.
Continuous fuzzing through OSS-Fuzz will help identify similar input validation vulnerabilities earlier and improve the robustness of the ORC file parser.
Project Details
This initial PR includes only the project.yaml configuration. The build infrastructure will be added in a follow-up PR after your approval.