Release Zstandard v1.5.1 · facebook/zstd

Notice: it has been brought to our attention that the v1.5.1 library might be built with an executable stack on non-x64 architectures. Some systems with strict security settings disallow executable stacks and may flag this as a problem. We are currently reviewing the issue. Be aware of it if you build libzstd for a non-x64 architecture.

Zstandard v1.5.1 is a maintenance release, bringing a good number of small refinements to the project. It also offers a welcome crop of performance improvements, as detailed below.

Performance Improvements

Speed improvements for fast compression (levels 1–4)

PRs #2749, #2774, and #2921 refactor single-segment compression for ZSTD_fast and ZSTD_dfast, which back compression levels 1 through 4 (as well as the negative compression levels). Speedups in the ~3-5% range are observed. In addition, the compression ratio of ZSTD_dfast (levels 3 and 4) is slightly improved.

Rebalanced middle compression levels

v1.5.0 introduced major speed improvements for mid-level compression (levels 5 to 12), while preserving a roughly similar compression ratio. As a consequence, the speed scale became tilted towards faster speeds. Unfortunately, the difference between successive levels was no longer regular, and there was a large performance gap just after the impacted range, between levels 12 and 13.

v1.5.1 tries to rebalance parameters so that compression levels can be roughly associated with their former speed budget. Consequently, v1.5.1 mid compression levels feature speeds closer to those of v1.4.9 (though still noticeably faster) and receive in exchange an improved compression ratio, as shown in the graph below.

Note that, since middle levels are only rebalanced, no significant performance difference between v1.5.0 and v1.5.1 should be expected outside of some special cases: levels merely occupy different positions on the same curve. The situation is a bit different for fast levels (1-4), for which v1.5.1 delivers a small but consistent performance benefit on all platforms, as described in the section on fast compression above.

Huffman Improvements

Our Huffman code was significantly revamped in this release. Both encoding and decoding speed were improved. Additionally, encoding speed for small inputs was improved even further. Speed is measured on the Silesia corpus by compressing with level 1, extracting the literals left over after compression, and then compressing and decompressing the literals from each block. Measurements are done on an Intel i9-9900K @ 3.6 GHz.

Compiler	Scenario	v1.5.0 Speed	v1.5.1 Speed	Delta
gcc-11	Literal compression - 128KB block	748 MB/s	927 MB/s	+23.9%
clang-13	Literal compression - 128KB block	810 MB/s	927 MB/s	+14.4%
gcc-11	Literal compression - 4KB block	223 MB/s	321 MB/s	+44.0%
clang-13	Literal compression - 4KB block	224 MB/s	310 MB/s	+38.2%
gcc-11	Literal decompression - 128KB block	1164 MB/s	1500 MB/s	+28.8%
clang-13	Literal decompression - 128KB block	1006 MB/s	1504 MB/s	+49.5%

Overall impact on (de)compression speed depends on the compressibility of the data. Compression speed improves by 1-4%, and decompression speed improves by 5-15%.
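
As a quick sanity check, the Delta column of the table above can be recomputed from the two speed columns. The sketch below is only illustrative: the `ROWS` list and the `percent_delta` helper are names introduced here, not part of zstd. Recomputed values may differ from the published ones by a few tenths of a percent, presumably because the published deltas were derived from unrounded measurements (an assumption, not something the release notes state).

```python
# Speeds in MB/s copied from the table above:
# (compiler, scenario, v1.5.0 speed, v1.5.1 speed, published delta in %)
ROWS = [
    ("gcc-11", "Literal compression - 128KB block", 748, 927, 23.9),
    ("clang-13", "Literal compression - 128KB block", 810, 927, 14.4),
    ("gcc-11", "Literal compression - 4KB block", 223, 321, 44.0),
    ("clang-13", "Literal compression - 4KB block", 224, 310, 38.2),
    ("gcc-11", "Literal decompression - 128KB block", 1164, 1500, 28.8),
    ("clang-13", "Literal decompression - 128KB block", 1006, 1504, 49.5),
]


def percent_delta(old: float, new: float) -> float:
    """Relative change from old to new, expressed in percent."""
    return (new - old) / old * 100.0


# Recompute the delta for every row from the rounded speeds.
recomputed = [percent_delta(old, new) for _, _, old, new, _ in ROWS]

for (compiler, scenario, _, _, published), delta in zip(ROWS, recomputed):
    print(f"{compiler:9s} {scenario:36s} "
          f"published {published:+.1f}%  recomputed {delta:+.1f}%")
```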

PR #2722 implements the Huffman decoder in assembly for x86-64 with BMI2 enabled. We detect BMI2 support at runtime, so this speedup applies to all x86-64 builds running on CPUs that support BMI2. This improves Huffman decoding speed by about 40%, depending on the scenario. PR #2733 improves Huffman encoding speed by 10% for clang and 20% for gcc. PR #2732 drastically speeds up the HUF_sort() function, which speeds up Huffman tree building for compression. This is a significant speed boost for small inputs, measuring in at a 40% improvement for 4K inputs.
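
For readers unfamiliar with what "Huffman tree building" involves, here is a minimal textbook sketch in Python. It is not zstd's implementation: zstd's Huffman code is written in C, limits the maximum code length, and uses the dedicated HUF_sort() routine mentioned above, whereas this sketch simply uses a binary heap. The function name `huffman_code_lengths` is hypothetical. It shows why ordering symbols by frequency sits at the heart of the construction: the two rarest subtrees are merged repeatedly, so rare symbols end up with longer codes.

```python
import heapq
from collections import Counter


def huffman_code_lengths(data: bytes) -> dict[int, int]:
    """Return a {symbol: code length in bits} map for a classic Huffman code.

    Textbook construction only; code lengths are not capped as a real
    entropy coder would require.
    """
    counts = Counter(data)
    if not counts:
        return {}
    if len(counts) == 1:
        # A single distinct symbol still needs one bit per occurrence.
        return {next(iter(counts)): 1}

    # Heap entries: (weight, tie-breaker, symbols in this subtree).
    # The tie-breaker is the smallest symbol of the subtree, which is unique
    # across disjoint subtrees, so the symbol lists are never compared.
    heap = [(weight, sym, [sym]) for sym, weight in counts.items()]
    heapq.heapify(heap)

    lengths = dict.fromkeys(counts, 0)
    while len(heap) > 1:
        w1, t1, syms1 = heapq.heappop(heap)
        w2, t2, syms2 = heapq.heappop(heap)
        # Merging two subtrees pushes every symbol in them one level deeper.
        for sym in syms1 + syms2:
            lengths[sym] += 1
        heapq.heappush(heap, (w1 + w2, min(t1, t2), syms1 + syms2))
    return lengths


lengths = huffman_code_lengths(b"aaaabbc")
print({chr(sym): bits for sym, bits in sorted(lengths.items())})
```

In this example the most frequent symbol gets the shortest code, and the code lengths satisfy the Kraft equality (the sum of 2^-length over all symbols is exactly 1).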

Binary Size and Build Speed

zstd binary size grew significantly in v1.5.0 due to the new code added for middle compression level speed optimizations. In this release we recover the binary size, and in the process also significantly speed up builds, especially with sanitizers enabled.

The sizes below are for libzstd.a, built for x86-64 with -O3. We regained 161 KB of binary size on gcc, and 293 KB on clang. Note that these sizes are for the whole library, optimized for speed over size. A decoder-only build, with size-saving options enabled and compiled with -Os or -Oz, can be much smaller.

Version	gcc-11 size	clang-13 size
v1.5.1	1177 KB	1167 KB
v1.5.0	1338 KB	1460 KB
v1.4.9	1137 KB	1151 KB
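
The "regained" figures quoted above follow directly from this table. The short sketch below recomputes them and also shows how much larger v1.5.1 remains compared with v1.4.9. The `SIZES_KB` dictionary and the `size_diff_kb` helper are illustrative names introduced here.

```python
# libzstd.a sizes in KB, copied from the table above.
SIZES_KB = {
    "v1.5.1": {"gcc-11": 1177, "clang-13": 1167},
    "v1.5.0": {"gcc-11": 1338, "clang-13": 1460},
    "v1.4.9": {"gcc-11": 1137, "clang-13": 1151},
}


def size_diff_kb(version_a: str, version_b: str) -> dict[str, int]:
    """Per-compiler size of version_a minus size of version_b, in KB."""
    return {
        compiler: SIZES_KB[version_a][compiler] - SIZES_KB[version_b][compiler]
        for compiler in SIZES_KB[version_a]
    }


# Size recovered by v1.5.1 relative to v1.5.0.
regained = size_diff_kb("v1.5.0", "v1.5.1")
# Size v1.5.1 still carries on top of v1.4.9.
remaining = size_diff_kb("v1.5.1", "v1.4.9")

for compiler in regained:
    share = regained[compiler] / SIZES_KB["v1.5.0"][compiler] * 100.0
    print(f"{compiler}: regained {regained[compiler]} KB "
          f"({share:.1f}% of v1.5.0), "
          f"still {remaining[compiler]} KB above v1.4.9")
```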

Change log

Featured user-visible changes

perf: rebalanced compression levels, to better match intended speed/level curve, by @senhuang42 and @Cyan4973
perf: faster huffman decoder, using x64 assembly, by @terrelln
perf: slightly faster high speed modes (strategies fast & dfast), by @felixhandte
perf: smaller binary size and faster compilation times, by @terrelln and @nolange
perf: new row64 mode, used notably at highest lazy2 levels 11-12, by @senhuang42
perf: faster mid-level compression speed in presence of highly repetitive patterns, by @senhuang42
perf: minor compression ratio improvements for small data at high levels, by @Cyan4973
perf: reduced stack usage (mostly useful for Linux Kernel), by @terrelln
perf: faster compression speed on incompressible data, by @binhdvo
perf: on-demand reduced ZSTD_DCtx state size, using build macro ZSTD_DECODER_INTERNAL_BUFFER, at a small cost of performance, by @binhdvo
build: allows hiding static symbols in the dynamic library, using build macro, by @skitt
build: support for m68k (Motorola 68000's), by @Cyan4973
build: improved AIX support, by @Helflym
build: improved meson unofficial build, by @eli-schwartz
cli : fix : forward mtime to output file, by @felixhandte
cli : custom memory limit when training dictionary (#2925), by @embg
cli : report advanced parameters information when compressing in very verbose mode (-vv), by @Svetlitski-FB
cli : advanced commands in the form --long-param= can accept negative value arguments, by @binhdvo

PR full list

Add determinism fuzzers and fix rare determinism bugs by @terrelln in #2648
ZSTD_VecMask_next: fix incorrect variable name in fallback code path by @dnelson-1901 in #2657
improve tar compatibility by @Cyan4973 in #2660
Enable SSE2 compression path to work on MSVC by @TrianglesPCT in #2653
Fix CircleCI Config to Fully Remove publish-github-release Job by @felixhandte in #2649
[CI] Fix zlib-wrapper test by @senhuang42 in #2668
[CI] Add ARM tests back into CI by @senhuang42 in #2667
[trace] Refine the ZSTD_HAVE_WEAK_SYMBOLS detection by @terrelln in #2674
[CI][1/2] Re-do the github actions workflows, migrate various travis and appveyor tests. by @senhuang42 in #2675
Make GH Actions CI tests run apt-get update before apt-get install by @senhuang42 in #2682
Add arm64 fuzz test to travis by @senhuang42 in #2686
Add ldm and block splitter auto-enable to old api by @senhuang42 in #2684
Add documentation for --patch-from by @binhdvo in #2693
Make regression test run on every PR by @senhuang42 in #2691
Initialize "potentially uninitialized" pointers. by @wolfpld in #2654
Flatten ZSTD_row_getMatchMask by @aqrit in #2681
Update README for Travis CI Badge by @gauthamkrishna9991 in #2700
Fuzzer test with no intrinsics on S390x (big endian) by @senhuang42 in #2678
Fix --progress flag to properly control progress display and default … by @binhdvo in #2698
[bug] Fix entropy repeat mode bug by @senhuang42 in #2697
Format File Sizes Human-Readable in the cli by @felixhandte in #2702
Add support for negative values in advanced flags by @binhdvo in #2705
[fix] Add missing bounds checks during compression by @terrelln in #2709
Add API for fetching skippable frame content by @binhdvo in #2708
Add option to use logical cores for default threads by @binhdvo in #2710
lib/Makefile: Fix small typo in ZSTD_FORCE_DECOMPRESS_* build macros by @luisdallos in #2714
[RFC] Add internal API for converting ZSTD_Sequence into seqStore by @senhuang42 in #2715
Optimize zstd decompression by another x% by @danlark1 in #2689
Include what you use in zstd_ldm_geartab by @danlark1 in #2719
[trace] remove zstd_trace.c reference from freestanding by @heitbaum in #2655
Remove folder when done with test by @senhuang42 in #2720
Proactively skip huffman compression based on sampling where non-comp… by @binhdvo in #2717
Add support for MCST LCC compiler by @makise-homura in #2725
[bug-fix] Fix a determinism bug with the DUBT by @terrelln in #2726
Fix DDSS Load by @felixhandte in #2729
Z_PREFIX zError function by @koalabearguo in #2707
pzstd: fix linking for static builds by @jonringer in #2724
[HUF] Improve Huffman encoding speed by @terrelln in #2733
[HUF] Improve Huffman sorting algorithm by @senhuang42 in #2732
Set mtime on Output Files by @felixhandte in #2742
[RFC] Rebalance compression levels by @senhuang42 in #2692
Improve branch misses on FSE symbol spreading by @senhuang42 in #2750
make ZSTD_HASHLOG3_MAX private by @Cyan4973 in #2752
meson fixups by @eli-schwartz in #2746
[easy] Fix zstd bench error message by @senhuang42 in #2753
Reduce test time on TravisCI by @Cyan4973 in #2757
added qemu tests by @Cyan4973 in #2758
Add 8 bytes to FSE_buildCTable wksp by @senhuang42 in #2761
minor rebalancing of level 13 by @Cyan4973 in #2762
Improve compile speed and binary size in opt by @senhuang42 in #2763
[easy] Fix patch-from help msg typo by @senhuang42 in #2769
Pipelined Implementation of ZSTD_fast (~+5% Speed) by @felixhandte in #2749
meson: fix type error for integer option by @eli-schwartz in #2775
Fix dictionary training huffman segfault and small speed improvement by @senhuang42 in #2773
[rsyncable] Ensure ZSTD_compressBound() is respected by @terrelln in #2776
Improve optimal parser performance on small data by @Cyan4973 in #2771
[rsyncable] Fix test failures by @terrelln in #2777
Revert opt outlining change by @senhuang42 in #2778
[build] Add support for ASM files in Make + CMake by @terrelln in #2783
add msvc2019 to build.generic.cmd by @animalize in #2787
[fuzzer] Add Huffman decompression fuzzer by @terrelln in #2784
Assembly implementation of 4X1 & 4X2 Huffman by @terrelln in #2722
[huf] Fix compilation when DYNAMIC_BMI2=0 && BMI2 is supported by @terrelln in #2791
Use new paramSwitch enum for row matchfinder and block splitter by @senhuang42 in #2788
Fix NCountWriteBound by @senhuang42 in #2779
[contrib][linux] Fix up SPDX license identifiers by @terrelln in #2794
[contrib][linux] Reduce stack usage by 80 bytes by @terrelln in #2795
Reduce stack usage of block splitter by @senhuang42 in #2780
minor: constify MatchState* parameter when possible by @Cyan4973 in #2797
[build] Fix oss-fuzz build with the dataflow sanitizer by @terrelln in #2799
[lib] Make lib compatible with -Wfall-through excepting legacy by @terrelln in #2796
[contrib][linux] Fix build after introducing ASM HUF implementation by @solbjorn in #2790
Smaller code with disabled features by @nolange in #2805
[huf] Fix OSS-Fuzz assert by @terrelln in #2808
Skip most long matches in lazy hash table update by @senhuang42 in #2755
add missing BUNDLE DESTINATION by @3nids in #2810
[contrib][linux] Fix -Wundef inside Linux kernel tree by @solbjorn in #2802
[contrib][linux-kernel] Add standard warnings and -Werror to CI by @terrelln in #2803
Add AIX support in Makefile by @Helflym in #2747
Limit train samples by @stanjo74 in #2809
[multiple-ddicts] Fix NULL checks by @terrelln in #2817
[ldm] Fix ZSTD_c_ldmHashRateLog bounds check by @terrelln in #2819
[binary-tree] Fix underflow of nbCompares by @terrelln in #2820
Enhance streaming_compression examples. by @marxin in #2813
Pipelined Implementation of ZSTD_dfast by @felixhandte in #2774
Fix a C89 error in msvc by @animalize in #2800
[asm] Switch to C style comments by @terrelln in #2825
Support thread pool section in HTML documentation. by @marxin in #2822
Reduce size of dctx by reutilizing dst buffer by @binhdvo in #2751
[lazy] Speed up compilation times by @terrelln in #2828
separate compression level tables into their own file by @Cyan4973 in #2830
minor : change build macro to ZSTD_DECODER_INTERNAL_BUFFER by @Cyan4973 in #2829
Fix oss fuzz test error by @binhdvo in #2837
Move mingw tests from appveyor to github actions by @binhdvo in #2838
Improvements to verbose mode output by @Svetlitski-FB in #2839
Use unused functions to appease Visual Studio by @senhuang42 in #2846
Backport zstd patch from LKML by @terrelln in #2849
Fix fullbench CI failure by @binhdvo in #2851
Fix Determinism Bug: Avoid Reducing Indices to Reserved Values by @felixhandte in #2850
ZSTD_copy16() uses ZSTD_memcpy() by @animalize in #2836
Display command line parameters with concrete values in verbose mode by @Svetlitski-FB in #2847
Reduce function size in fast & dfast by @terrelln in #2863
[linux-kernel] Don't inline function in zstd_opt.c by @terrelln in #2864
Remove executable flag from GNU_STACK segment by @ko-zu in #2857
[linux-kernel] Don't add -O3 to CFLAGS by @terrelln in #2866
Support Swift Package Manager by @cntrump in #2858
Determinism: Avoid Mapping Window into Reserved Indices during Reduction by @felixhandte in #2869
Clarify documentation for -c by @binhdvo in #2883
Fix build for cygwin/bsd by @binhdvo in #2882
Move visual studio tests from per-release to per-PR by @senhuang42 in #2845
Fix SPM warning: umbrella header for module 'libzstd' does not include header 'xxx.h' by @cntrump in #2872
Add detection when compiling with Clang and Ninja under Windows by @jannkoeker in #2877
[contrib][pzstd] Fix build issue with gcc-5 by @terrelln in #2889
[bmi2] Add lzcnt and bmi target attributes by @terrelln in #2888
[test] Test that the exec-stack bit isn't set on libzstd.so by @terrelln in #2886
Solve the bug of extra output newline character by @15596858998 in #2876
[zdict] Remove ZDICT_CONTENTSIZE_MIN restriction for ZDICT_finalizeDictionary by @terrelln in #2887
Explicitly hide static symbols by @skitt in #2501
Makefile: sort all wildcard file list expansions by @kanavin in #2895
merge #2501 by @Cyan4973 in #2894
Makefile: fix build for mingw by @sapiippo in #2687
[CircleCI] Fix short-tests-0 by @terrelln in #2892
Zstandard compiles and run on m68k cpus by @Cyan4973 in #2896
Improve zstd_opt build speed and size by @terrelln in #2898
[CI] Add cmake windows build by @terrelln in #2900
Disable Multithreading in CMake Builds for Android by @felixhandte in #2899
Avoid Using Deprecated Functions in Deprecated Code by @felixhandte in #2897
[asm] Share portability macros and restrict ASM further by @terrelln in #2893
fixbug CLI's -D fails when the argument is not a regular file by @15596858998 in #2890
Apply FORCE_MEMORY_ACCESS=1 to legacy by @Hello71 in #2907
[lib] Fix libzstd.pc for lib-mt builds by @ericonr in #2659
Imply -q when stderr is not a tty by @binhdvo in #2884
Fix Up #2659; Build libzstd.pc Whenever Building the Lib on Unix by @felixhandte in #2912
Remove possible NULL pointer addition by @terrelln in #2916
updated xxHash to latest v0.8.1 by @Cyan4973 in #2914
Reject Irregular Dictionary Files by @felixhandte in #2910
x32 compatibility by @Cyan4973 in #2922
typo: Small spelling mistake in example by @IAL32 in #2923
add test case by @15596858998 in #2905
Stagger Stepping in Negative Levels by @felixhandte in #2921
Fix performance degradation with -m32 by @binhdvo in #2926
Reduce tables to 8bit by @nolange in #2930
simplify SSE implementation of row_lazy match finder by @Cyan4973 in #2929
Allow user to specify memory limit for dictionary training by @embg in #2925
fixed incorrect rowlog initialization by @Cyan4973 in #2931
rebalance lazy compression levels by @Cyan4973 in #2934

New Contributors

@dnelson-1901 made their first contribution in #2657
@TrianglesPCT made their first contribution in #2653
@binhdvo made their first contribution in #2693
@wolfpld made their first contribution in #2654
@aqrit made their first contribution in #2681
@gauthamkrishna9991 made their first contribution in #2700
@luisdallos made their first contribution in #2714
@danlark1 made their first contribution in #2689
@heitbaum made their first contribution in #2655
@makise-homura made their first contribution in #2725
@koalabearguo made their first contribution in #2707
@jonringer made their first contribution in #2724
@eli-schwartz made their first contribution in #2746
@abxhr made their first contribution in #2798
@solbjorn made their first contribution in #2790
@nolange made their first contribution in #2805
@3nids made their first contribution in #2810
@Helflym made their first contribution in #2747
@stanjo74 made their first contribution in #2809
@Svetlitski-FB made their first contribution in #2839
@cntrump made their first contribution in #2858
@rex4539 made their first contribution in #2856
@jannkoeker made their first contribution in #2877
@yoniko made their first contribution in #2885
@15596858998 made their first contribution in #2876
@kanavin made their first contribution in #2895
@sapiippo made their first contribution in #2687
@supperPants made their first contribution in #2891
@Hello71 made their first contribution in #2907
@ericonr made their first contribution in #2659
@IAL32 made their first contribution in #2923
@embg made their first contribution in #2925

Full Changelog: v1.5.0...v1.5.1
