Tar v1.4.0
This release adds two new API functions:
- `rewrite` allows a tarball to be rewritten to the standard form that `create` produces without needing to fully extract and re-create the tarball. If the input stream is seekable, it makes one pass to index the tarball and uses seek to access file data in the correct order. If the input stream is not seekable, it collects all the data in a buffer first and then uses that.
- `tree_hash` computes the git tree hash (SHA1 and SHA256 supported) of a tarball without needing to extract it to disk. This is particularly useful since some file systems lack features that are significant to git when hashing a file tree (e.g. symlinks, case preservation, ability to set/get executable bits).
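The two new functions can be tried together in a short sketch. The directory layout and file names below are made up for illustration. The `algorithm = "git-sha256"` keyword follows the Tar.jl README as best I recall, so check it against the installed version.

```julia
using Tar

# Build a small directory tree to archive (hypothetical file names).
src = mktempdir()
write(joinpath(src, "hello.txt"), "hello\n")
mkdir(joinpath(src, "sub"))
write(joinpath(src, "sub", "data.txt"), "some data\n")

# Tar.create returns the path of the tarball it wrote (a temp file here).
original = Tar.create(src)

# Rewrite the tarball into the standard form without extracting it to disk.
# With no output argument, the result goes to a new temp file.
rewritten = Tar.rewrite(original)

# Compute git tree hashes directly from the tarball.
sha1   = Tar.tree_hash(original)
sha256 = Tar.tree_hash(original; algorithm = "git-sha256")

# Rewriting keeps the file tree intact, so the tree hash should be unchanged.
same_tree = Tar.tree_hash(rewritten) == sha1
println("sha1 = ", sha1, ", unchanged after rewrite: ", same_tree)
```

Comparing tree hashes rather than raw bytes is deliberate: the tree hash depends only on the file tree's content and metadata that git tracks, which is what `rewrite` is meant to preserve.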
This also includes significant refactoring of the internal `{read,write}_tarball` functions. This refactoring allows `read_tarball` to be used by `extract`, `rewrite` and `tree_hash`, while `write_tarball` is used by `create` and `rewrite`. In the future these internal functions may be promoted to official low-level APIs.
Closed issues:
Merged pull requests:
- implement Tar.tree_hash (with other improvements) (#36) (@StefanKarpinski)
- tree_hash: more efficient file hashing (#37) (@StefanKarpinski)
- README: add note about reproducibility and tree_hash (#39) (@StefanKarpinski)
- read_data: fix premature eof logic (#40) (@StefanKarpinski)
- format change: sort purely by name, no '/' added to dirs (#42) (@StefanKarpinski)
- create.jl: factor out reusable write_tarball function (#43) (@StefanKarpinski)
- new API: Tar.rewrite([pred], old, [new]) (#44) (@StefanKarpinski)
- list_tarball: specialize on read_hdr function (#45) (@StefanKarpinski)
- skip(Process): use buffered I/O when skipping process output (#46) (@StefanKarpinski)
- open_{read,write}: use generic `file` argument name (#47) (@StefanKarpinski)