[C++] Lz4HadoopCodec::Compress writes single oversized block incompatible with Hadoop Lz4Decompressor #49641
Description
Describe the bug
Lz4HadoopCodec::Compress writes the entire input as a single Hadoop-framed LZ4 block regardless of size. Hadoop's Lz4Decompressor allocates a fixed 256 KiB output buffer per block (IO_COMPRESSION_CODEC_LZ4_BUFFERSIZE_DEFAULT = 256 * 1024), so any block whose decompressed size exceeds 256 KiB causes an LZ4Exception in JVM readers (parquet-mr + Hadoop BlockDecompressorStream).
PARQUET-1878 added Lz4HadoopCodec, but it writes each page as a single block. ARROW-11301 fixed the reader to handle multi-block Hadoop data, but the writer was never updated to split large inputs into multiple blocks the way Hadoop's BlockCompressorStream does.
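For reference, a Hadoop-compatible writer splits the input into chunks no larger than the 256 KiB buffer and frames each chunk with two big-endian uint32 length prefixes (decompressed size, then compressed size), which is the multi-block layout the ARROW-11301 reader already accepts. The sketch below shows only the splitting and framing arithmetic; `MakeHadoopLz4Frames` is a hypothetical helper, not the Arrow API, and it copies the payload verbatim where a real implementation would run LZ4 compression on each chunk.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hadoop Lz4Decompressor's per-block output buffer size
// (IO_COMPRESSION_CODEC_LZ4_BUFFERSIZE_DEFAULT = 256 * 1024).
constexpr std::size_t kHadoopLz4BlockSize = 256 * 1024;

// Append a 32-bit value in big-endian order, matching the length
// prefixes written by Hadoop's BlockCompressorStream.
void PutUint32BE(std::vector<uint8_t>* out, uint32_t v) {
  out->push_back(static_cast<uint8_t>(v >> 24));
  out->push_back(static_cast<uint8_t>(v >> 16));
  out->push_back(static_cast<uint8_t>(v >> 8));
  out->push_back(static_cast<uint8_t>(v));
}

// Hypothetical multi-block framer (sketch, not the Arrow API).
// Each chunk <= kHadoopLz4BlockSize is framed as
//   [uint32 BE decompressed size][uint32 BE compressed size][payload].
// A real implementation would compress each chunk with LZ4; here the
// payload is copied as-is so the sketch stays self-contained.
std::vector<uint8_t> MakeHadoopLz4Frames(const uint8_t* input, std::size_t len) {
  std::vector<uint8_t> out;
  for (std::size_t pos = 0; pos < len; pos += kHadoopLz4BlockSize) {
    std::size_t chunk = std::min(kHadoopLz4BlockSize, len - pos);
    PutUint32BE(&out, static_cast<uint32_t>(chunk));  // decompressed size
    PutUint32BE(&out, static_cast<uint32_t>(chunk));  // "compressed" size
    out.insert(out.end(), input + pos, input + pos + chunk);
  }
  return out;
}
```

With a 320 KiB input this produces two blocks (256 KiB + 64 KiB), each independently decodable into Hadoop's fixed-size buffer; the current Lz4HadoopCodec::Compress instead emits one 320 KiB block, which overflows that buffer.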
Steps to reproduce
Write a Parquet file with LZ4_HADOOP compression containing a dictionary page >256 KiB (e.g. 40K unique INT64 values = 320 KiB), then read it with a JVM-based Parquet reader (parquet-mr + Hadoop).
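The size arithmetic behind the example can be checked directly (assuming "40K" means 40 * 1024 values): 40 * 1024 INT64 values at 8 bytes each is exactly 320 KiB, which exceeds the 256 KiB buffer. `OverflowsHadoopBuffer` below is a hypothetical helper for illustration only.

```cpp
#include <cstddef>

// Hadoop Lz4Decompressor's fixed per-block output buffer
// (IO_COMPRESSION_CODEC_LZ4_BUFFERSIZE_DEFAULT).
constexpr std::size_t kHadoopLz4BufferSize = 256 * 1024;

// Hypothetical helper: would a page written as a single block of this
// decompressed size overflow Hadoop's per-block buffer?
constexpr bool OverflowsHadoopBuffer(std::size_t decompressed_size) {
  return decompressed_size > kHadoopLz4BufferSize;
}

// 40K unique INT64 values: 40 * 1024 values, 8 bytes each = 320 KiB.
constexpr std::size_t kDictionaryPageBytes = 40 * 1024 * 8;
static_assert(kDictionaryPageBytes == 320 * 1024,
              "dictionary page is 320 KiB");
static_assert(OverflowsHadoopBuffer(kDictionaryPageBytes),
              "exceeds 256 KiB, so a single-block page fails on JVM readers");
```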
Expected behavior
The file should be readable by JVM-based Parquet readers.
Actual behavior
net.jpountz.lz4.LZ4Exception: Error decoding offset 131193 of input buffer
at net.jpountz.lz4.LZ4JNISafeDecompressor.decompress(LZ4JNISafeDecompressor.java:71)
at org.apache.hadoop.io.compress.lz4.Lz4Decompressor.decompressDirectBuf(Lz4Decompressor.java:278)
at org.apache.hadoop.io.compress.BlockDecompressorStream.decompress(BlockDecompressorStream.java:88)
...
Severity
Read failure, not data corruption. The bytes on disk are valid LZ4 — Arrow's own C++ reader handles them fine. The JVM reader throws a hard exception; it does not return wrong data.
Component(s)
C++
Related issues
ARROW-9177, PARQUET-1878, ARROW-11301