Support compressed multiline JSON #13

mikix · 2025-02-12T14:29:21Z

NDJSON files can get pretty big - and they are extremely compressible.

The multi-line JSON methods should learn to recognize at least .gz and .zip, and maybe .xz and .bz2 too (others?), and then pass those files through Python's built-in decompressors. I don't think Python has a generic layer for this already, so this might require a little abstraction code.
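A rough sketch of what that abstraction could look like, using only the standard library. The names `iter_text_lines` and `read_ndjson` are made up for illustration and are not existing methods of this project. It also assumes a .zip archive holds exactly one NDJSON member and that files are UTF-8.

```python
import bz2
import gzip
import io
import json
import lzma
import pathlib
import zipfile

# Suffix -> stdlib opener. All three accept (path, "rt", encoding=...).
_OPENERS = {
    ".gz": gzip.open,
    ".bz2": bz2.open,
    ".xz": lzma.open,
}


def iter_text_lines(path):
    """Yield text lines from a file, decompressing based on its suffix.

    Hypothetical helper, not part of any existing API.
    """
    suffix = pathlib.Path(path).suffix.lower()
    if suffix == ".zip":
        # zip is an archive format, so a member has to be picked.
        # Assumption: exactly one member per archive.
        with zipfile.ZipFile(path) as archive:
            names = archive.namelist()
            if len(names) != 1:
                raise ValueError(f"expected one file in {path}, found {len(names)}")
            with archive.open(names[0]) as raw:
                yield from io.TextIOWrapper(raw, encoding="utf-8")
    else:
        # Unknown suffixes fall back to a plain uncompressed open().
        opener = _OPENERS.get(suffix, open)
        with opener(path, "rt", encoding="utf-8") as f:
            yield from f


def read_ndjson(path):
    """Yield one parsed object per non-blank line."""
    for line in iter_text_lines(path):
        if line.strip():
            yield json.loads(line)
```

Dispatching on the suffix keeps this simple, but it will misread a compressed file that lacks the expected extension. Sniffing magic bytes would be a possible alternative.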

Some stats on the various linuxy compression approaches, for the curious. Seems like when taking time into account, gzip is "good enough".

mikix added the enhancement New feature or request label Feb 12, 2025
