Commit

add options to make reading ZIP archives less strict
KurtThiemann committed Dec 10, 2024
1 parent 4ed35d0 commit f95bf80
Showing 13 changed files with 130 additions and 34 deletions.
20 changes: 15 additions & 5 deletions README.md
@@ -58,11 +58,13 @@ await archive.init();
The ReadArchive constructor optionally accepts a [ReadArchiveOptions](src/Options/ReadArchiveOptions.js) object with
the following properties:

| Name | Type | Description |
|------------------------------|---------------------------------------------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| `centralDirectoryBufferSize` | number | Buffer size used when reading central directory contents.<br/>Larger buffer sizes may improve performance, but also increase RAM usage. |
| `createEntryIndex` | boolean | Whether an index of all central directory entries should be created the first time they are read.<br/>Massively increases performance when using `findEntry` multiple times. |
| `entryOptions` | [EntryOptions](src/Options/EntryOptions.js) | Options passed to each created Entry object. |
| Name | Type | Description |
|----------------------------------|---------------------------------------------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| `centralDirectoryBufferSize` | number | Buffer size used when reading central directory contents.<br/>Larger buffer sizes may improve performance, but also increase RAM usage. |
| `createEntryIndex` | boolean | Whether an index of all central directory entries should be created the first time they are read.<br/>Massively increases performance when using `findEntry` multiple times. |
| `entryOptions` | [EntryOptions](src/Options/EntryOptions.js) | Options passed to each created Entry object. |
| `ignoreMultiDiskErrors` | boolean | Ignore information about multiple disks instead of throwing an error when encountering a multi-disk archive. |
| `allowTruncatedCentralDirectory` | boolean | Do not throw an error if the central directory does not contain the expected number of entries. |

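As a rough illustration of the two new options, the sketch below opens an archive with both of them enabled. `openLenient` is a hypothetical helper, not part of armarius. The archive class is passed in as a parameter, and the sketch assumes the constructor takes an IO object followed by the options object; only the options argument and `archive.init()` are confirmed by the README text above.

```javascript
// Hypothetical helper (not part of armarius): opens an archive with both
// lenient options from the table above enabled.
// Assumption: the constructor signature is (io, options).
async function openLenient(ReadArchiveClass, io) {
    const archive = new ReadArchiveClass(io, {
        ignoreMultiDiskErrors: true,
        allowTruncatedCentralDirectory: true,
    });
    await archive.init();
    return archive;
}
```

Both options default to `false`, so existing callers keep the strict behavior unless they opt in.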
[EntryOptions](src/Options/EntryOptions.js) can have the following properties:

@@ -133,6 +135,14 @@ file. Since this data is decompressed, the size of the returned chunk might diff
Also note that an empty chunk returned from `EntryDataReader.read` does not necessarily indicate that all data has been read.
After all data was read, `null` will be returned instead.

Both `getDataReader` and `getData` optionally accept an [EntryDataReaderOptions](src/Options/EntryDataReaderOptions.js) object with
the following properties:

| Name | Type | Description |
|---------------------------------|---------|---------------------------------------------------------------------------------|
| `ignoreInvalidChecksums` | boolean | Do not throw an error if the uncompressed data does not match the checksum |
| `ignoreInvalidUncompressedSize` | boolean | Do not throw an error if the uncompressed data does not match the expected size |

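One possible way to use these options is to attempt a strict read first and fall back to a lenient read only when the strict one fails. The sketch below assumes `entry` exposes `getData(chunkSize, options)` as introduced in this commit; `readEntryData` itself is a hypothetical helper, not part of the library.

```javascript
// Hypothetical helper (not part of armarius): tries a strict read first and
// falls back to a lenient read only if the strict one throws.
// `entry` is assumed to expose getData(chunkSize, options).
async function readEntryData(entry, chunkSize = 1024 * 64) {
    try {
        return {data: await entry.getData(chunkSize), strict: true};
    } catch (strictError) {
        // Retry while ignoring checksum and size mismatches. Errors unrelated
        // to those checks will simply be thrown again by this second call.
        const data = await entry.getData(chunkSize, {
            ignoreInvalidChecksums: true,
            ignoreInvalidUncompressedSize: true,
        });
        return {data, strict: false, strictError};
    }
}
```

Returning the `strict` flag lets the caller decide whether data recovered from a damaged entry should be trusted.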
### Writing archives

New archives can be created using a [WriteArchive](src/Archive/WriteArchive.js) object.
2 changes: 2 additions & 0 deletions index-browser.js
@@ -34,11 +34,13 @@ export { default as UnicodeExtraField } from "./src/Archive/Structure/ExtraField

export { default as CP437 } from "./src/Util/CP437.js";
export { default as MsDosTime } from "./src/Util/MsDosTime.js";
export { default as Resize } from "./src/Util/Resize.js";

export { default as Options } from "./src/Options/Options.js";
export { default as EntrySourceOptions } from "./src/Options/EntrySourceOptions.js";
export { default as ReadArchiveOptions } from "./src/Options/ReadArchiveOptions.js";
export { default as WriteArchiveOptions } from "./src/Options/WriteArchiveOptions.js";
export { default as EntryDataReaderOptions } from "./src/Options/EntryDataReaderOptions.js";

export { default as ArmariusError } from "./src/Error/ArmariusError.js";
export { default as ChecksumError } from "./src/Error/ChecksumError.js";
2 changes: 2 additions & 0 deletions index.js
@@ -34,11 +34,13 @@ export { default as UnicodeExtraField } from "./src/Archive/Structure/ExtraField

export { default as CP437 } from "./src/Util/CP437.js";
export { default as MsDosTime } from "./src/Util/MsDosTime.js";
export { default as Resize } from "./src/Util/Resize.js";

export { default as Options } from "./src/Options/Options.js";
export { default as EntrySourceOptions } from "./src/Options/EntrySourceOptions.js";
export { default as ReadArchiveOptions } from "./src/Options/ReadArchiveOptions.js";
export { default as WriteArchiveOptions } from "./src/Options/WriteArchiveOptions.js";
export { default as EntryDataReaderOptions } from "./src/Options/EntryDataReaderOptions.js";

export { default as ArmariusError } from "./src/Error/ArmariusError.js";
export { default as ChecksumError } from "./src/Error/ChecksumError.js";
38 changes: 22 additions & 16 deletions package-lock.json

2 changes: 1 addition & 1 deletion package.json
@@ -1,6 +1,6 @@
{
"name": "armarius",
"version": "2.1.2",
"version": "2.2.0",
"description": "A JavaScript library to read, write, and merge ZIP archives in web browsers.",
"repository": "github:aternosorg/armarius",
"type": "module",
25 changes: 21 additions & 4 deletions src/Archive/Entry/ArchiveEntry.js
@@ -10,6 +10,8 @@ import EntryOptions from '../../Options/EntryOptions.js';
import {BigInt, CRC32} from 'armarius-io';
import FeatureError from '../../Error/FeatureError.js';
import ArmariusError from '../../Error/ArmariusError.js';
import ZipError from '../../Error/ZipError.js';
import Resize from '../../Util/Resize.js';

const decoder = new TextDecoder();

@@ -230,35 +232,50 @@ export default class ArchiveEntry {
}

/**
* @param {EntryDataReaderOptions|EntryDataReaderOptionsObject} options
* @returns {Promise<EntryDataReader>}
*/
async getDataReader() {
async getDataReader(options = {}) {
if (this.isDirectory()) {
throw new ArmariusError(`Cannot create data reader: ${this.getFileNameString()} is a directory`);
}
await this.readLocalFileHeader();
return new EntryDataReader(
await this.getDataProcessor(),
this.centralDirectoryFileHeader.crc32,
Number(this.getUncompressedSize())
Number(this.getUncompressedSize()),
options
);
}

/**
* @param {number} chunkSize
* @param {EntryDataReaderOptions|EntryDataReaderOptionsObject} options
* @returns {Promise<Uint8Array>}
*/
async getData(chunkSize = 1024 * 64) {
async getData(chunkSize = 1024 * 64, options = {}) {
let res = new Uint8Array(Number(this.getUncompressedSize()));
let offset = 0;
let reader = await this.getDataReader();
let reader = await this.getDataReader(options);

let chunk;
while ((chunk = await reader.read(chunkSize)) !== null) {
if (offset + chunk.byteLength > res.byteLength) {
if (!options.ignoreInvalidUncompressedSize) {
throw new ZipError(`Data size exceeds expected uncompressed size.`);
}
res = Resize.resizeBuffer(res, res.byteLength + chunkSize * 8);
}
res.set(chunk, offset);
offset += chunk.byteLength;
}

if (offset < res.byteLength) {
if (!options.ignoreInvalidUncompressedSize) {
throw new ZipError(`Data size is less than expected uncompressed size.`);
}
return new Uint8Array(res.buffer, res.byteOffset, offset);
}
return res;
}

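The grow-then-trim handling in `getData` can be sketched in isolation as follows. This is a standalone illustration, not the library code: `growBuffer` is a stand-in for `Resize.resizeBuffer` (whose implementation is not part of this diff), `collectChunks` and its parameters are hypothetical names, and the growth strategy (doubling) differs from the `chunkSize * 8` increment used in the commit.

```javascript
// Stand-in for Resize.resizeBuffer: copies the data into a larger buffer.
function growBuffer(buffer, newSize) {
    const grown = new Uint8Array(newSize);
    grown.set(buffer);
    return grown;
}

// Collects chunks into a buffer sized from the declared uncompressed size.
// In lenient mode the buffer grows when the data is longer than declared
// and is trimmed when the data is shorter; in strict mode both cases throw.
function collectChunks(chunks, expectedSize, lenient = false) {
    let res = new Uint8Array(expectedSize);
    let offset = 0;
    for (const chunk of chunks) {
        if (offset + chunk.byteLength > res.byteLength) {
            if (!lenient) {
                throw new Error('Data size exceeds expected uncompressed size.');
            }
            // Grow to at least the required size; doubling limits reallocations.
            res = growBuffer(res, Math.max(offset + chunk.byteLength, res.byteLength * 2));
        }
        res.set(chunk, offset);
        offset += chunk.byteLength;
    }
    if (offset < res.byteLength) {
        if (!lenient) {
            throw new Error('Data size is less than expected uncompressed size.');
        }
        // Return a view over the bytes that were actually written.
        return res.subarray(0, offset);
    }
    return res;
}
```

Because the trimmed result is a view rather than a copy, it shares memory with the larger backing buffer; callers that need a tight allocation can copy it with `slice()`.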
10 changes: 8 additions & 2 deletions src/Archive/Entry/EntryDataReader.js
@@ -1,23 +1,27 @@
import ChecksumError from '../../Error/ChecksumError.js';
import {DataStream} from 'armarius-io';
import EntryDataReaderOptions from '../../Options/EntryDataReaderOptions.js';

export default class EntryDataReader extends DataStream {
/** @type {import("armarius-io").DataProcessor} */ dataProcessor;
/** @type {import("armarius-io").CRC32} */ crc32;
/** @type {number} */ expectedCrc32;
/** @type {number} */ expectedSize;
/** @type {number} */ offset = 0;
/** @type {EntryDataReaderOptions} */ options;

/**
* @param {import("armarius-io").DataProcessor} dataProcessor
* @param {number} expectedCrc32
* @param {number} expectedSize
* @param {EntryDataReaderOptions|EntryDataReaderOptionsObject} options
*/
constructor(dataProcessor, expectedCrc32, expectedSize) {
constructor(dataProcessor, expectedCrc32, expectedSize, options) {
super();
this.dataProcessor = dataProcessor;
this.expectedCrc32 = expectedCrc32;
this.expectedSize = expectedSize;
this.options = EntryDataReaderOptions.from(options);
}

/**
@@ -44,7 +48,9 @@ export default class EntryDataReader extends DataStream {

if (this.dataProcessor.getPostCrc()) {
if (eof && this.dataProcessor.getPostCrc().finish() !== this.expectedCrc32) {
throw new ChecksumError('CRC32 checksum does not match expected value');
if (!this.options.ignoreInvalidChecksums) {
throw new ChecksumError(`CRC32 checksum does not match expected value. Expected ${this.expectedCrc32}, got ${this.dataProcessor.getPostCrc().finish()}`);
}
}
}

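For readers unfamiliar with the check being relaxed here, the following standalone sketch computes a CRC-32 over a byte array and compares it with an expected value, with a flag that mirrors `ignoreInvalidChecksums`. It is not the library code: `crc32` is a plain table-based implementation of the CRC-32 variant used by ZIP, and `verifyChecksum` is a hypothetical helper.

```javascript
// Lookup table for CRC-32 with the reflected polynomial 0xEDB88320.
const CRC_TABLE = new Uint32Array(256);
for (let n = 0; n < 256; n++) {
    let c = n;
    for (let k = 0; k < 8; k++) {
        c = (c & 1) ? (0xEDB88320 ^ (c >>> 1)) : (c >>> 1);
    }
    CRC_TABLE[n] = c >>> 0;
}

// Computes the CRC-32 of a Uint8Array as an unsigned 32-bit number.
function crc32(bytes) {
    let crc = 0xFFFFFFFF;
    for (const byte of bytes) {
        crc = CRC_TABLE[(crc ^ byte) & 0xFF] ^ (crc >>> 8);
    }
    return (crc ^ 0xFFFFFFFF) >>> 0;
}

// Hypothetical helper: returns whether the checksum matched, and throws on
// a mismatch unless ignoreInvalidChecksums is set.
function verifyChecksum(bytes, expectedCrc32, ignoreInvalidChecksums = false) {
    const actual = crc32(bytes);
    if (actual !== expectedCrc32 && !ignoreInvalidChecksums) {
        throw new Error(`CRC32 checksum does not match expected value. Expected ${expectedCrc32}, got ${actual}`);
    }
    return actual === expectedCrc32;
}
```

Returning the match result even in lenient mode lets a caller log or flag damaged entries instead of silently accepting them.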
9 changes: 9 additions & 0 deletions src/Archive/Entry/EntryIterator.js
@@ -1,10 +1,12 @@
import ArchiveIndex from "../../Index/ArchiveIndex.js";
import ArchiveEntry from "./ArchiveEntry.js";
import {BigInt} from 'armarius-io';
import ZipError from '../../Error/ZipError.js';

export default class EntryIterator {
/** @type {ReadArchive} */ archive;
/** @type {BigInt} */ entryCount;
/** @type {BigInt} */ size;
/** @type {number} */ startOffset;
/** @type {BigInt} */ currentEntry;
/** @type {import("armarius-io").IO} */ io;
@@ -22,6 +24,7 @@ export default class EntryIterator {
this.createIndex = createIndex;
this.startOffset = io.offset;
this.entryCount = archive.getCentralDirectoryEntryCount();
this.size = archive.getCentralDirectorySize();
this.reset();
}

@@ -53,6 +56,12 @@ export default class EntryIterator {
if (this.currentEntry >= this.entryCount) {
return null;
}
if (this.io.offset >= BigInt(this.startOffset) + this.size) {
if (!this.archive.options.allowTruncatedCentralDirectory) {
throw new ZipError("Reached end of central directory data before all entries were read");
}
return null;
}
let offset = this.io.offset;
let entry = await ArchiveEntry.load(this.archive, this.io, offset);
this.currentEntry++;
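
The two stop conditions added to the iterator can be summarized in a standalone sketch. This is not the library code: `shouldStop` and its parameter names are hypothetical, and offsets and sizes are plain numbers here, whereas the library uses the `BigInt` export from `armarius-io`.

```javascript
// Decides whether iteration over the central directory should stop.
// Returns true when there is nothing more to read, false otherwise, and
// throws if the directory data ends early while truncation is not allowed.
function shouldStop({currentEntry, entryCount, offset, startOffset, size}, allowTruncated = false) {
    if (currentEntry >= entryCount) {
        return true; // all declared entries were read
    }
    if (offset >= startOffset + size) {
        if (!allowTruncated) {
            throw new Error('Reached end of central directory data before all entries were read');
        }
        return true; // truncated directory: stop without an error
    }
    return false;
}
```

With truncation allowed, the iterator yields however many entries fit into the declared central directory size, even if the end-of-central-directory record announces more.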
11 changes: 8 additions & 3 deletions src/Archive/ReadArchive.js
@@ -55,7 +55,9 @@ export default class ReadArchive {
this.prependedDataLength = 0;

if(this.endOfCentralDirectoryRecord.diskNumber !== 0 || this.endOfCentralDirectoryRecord.centralDirectoryDiskNumber !== 0) {
throw new FeatureError('Multi disk archives are not supported');
if (!this.options.ignoreMultiDiskErrors) {
throw new FeatureError('Multi disk archives are not supported');
}
}

/*
@@ -134,7 +136,7 @@ export default class ReadArchive {
);
this.endOfCentralDirectoryLocator64 = await EndOfCentralDirectoryLocator64.fromIO(endOfDirectoryLocatorReader);

if(this.endOfCentralDirectoryLocator64.disks > 1) {
if(this.endOfCentralDirectoryLocator64.disks > 1 && !this.options.ignoreMultiDiskErrors) {
throw new FeatureError('Multi disk archives are not supported');
}

@@ -167,7 +169,10 @@ export default class ReadArchive {

if(this.endOfCentralDirectoryRecord64.diskNumber !== 0 ||
this.endOfCentralDirectoryRecord64.centralDirectoryDiskNumber !== 0) {
throw new FeatureError('Multi disk archives are not supported');

if (!this.options.ignoreMultiDiskErrors) {
throw new FeatureError('Multi disk archives are not supported');
}
}

this.centralDirectoryByteLength = Number(this.endOfCentralDirectoryRecord64.centralDirectorySize);
13 changes: 13 additions & 0 deletions src/Options/EntryDataReaderOptions.js
@@ -0,0 +1,13 @@
import Options from './Options.js';

/**
* @typedef {Object} EntryDataReaderOptionsObject
* @property {boolean} [ignoreInvalidChecksums]
* @property {boolean} [ignoreInvalidUncompressedSize]
*/

export default class EntryDataReaderOptions extends Options {
/** @type {boolean} */ ignoreInvalidChecksums = false;
/** @type {boolean} */ ignoreInvalidUncompressedSize = false;
}

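The `EntryDataReader` constructor converts its argument with `EntryDataReaderOptions.from(options)`, but the `Options` base class is not part of this diff. The sketch below shows one way such a `from` helper could merge a plain object over class-field defaults. All names prefixed with `Sketch` are hypothetical, and the real `Options.js` may behave differently.

```javascript
// Hypothetical stand-in for the Options base class.
class SketchOptions {
    // Accepts an instance (returned as-is) or a plain object whose known
    // keys override the defaults declared as class fields.
    static from(options) {
        if (options instanceof this) {
            return options;
        }
        const source = options ?? {};
        const result = new this();
        for (const key of Object.keys(result)) {
            if (source[key] !== undefined) {
                result[key] = source[key];
            }
        }
        return result;
    }
}

// Mirrors the fields and defaults of EntryDataReaderOptions in this commit.
class SketchEntryDataReaderOptions extends SketchOptions {
    ignoreInvalidChecksums = false;
    ignoreInvalidUncompressedSize = false;
}
```

Under this assumption, a call such as `SketchEntryDataReaderOptions.from({ignoreInvalidChecksums: true})` yields an instance where only that flag differs from the defaults, and unknown keys are dropped.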
4 changes: 2 additions & 2 deletions src/Options/EntryOptions.js
@@ -8,10 +8,10 @@ defaultDataProcessors.set(Constants.COMPRESSION_METHOD_DEFLATE, DefaultInflateDa

/**
* @typedef {Object} EntryOptionsObject
* @property {Map<number, typeof DataProcessor>} [dataProcessors]
* @property {Map<number, typeof import("armarius-io").DataProcessor>} [dataProcessors]
*/

export default class EntryOptions extends Options {
/** @type {Map<number, typeof DataProcessor>} */ dataProcessors = defaultDataProcessors;
/** @type {Map<number, typeof import("armarius-io").DataProcessor>} */ dataProcessors = defaultDataProcessors;
}

6 changes: 5 additions & 1 deletion src/Options/ReadArchiveOptions.js
@@ -6,12 +6,16 @@ import Constants from '../Constants.js';
* @property {number} [centralDirectoryBufferSize]
* @property {boolean} [createEntryIndex]
* @property {EntryOptions|EntryOptionsObject} [entryOptions]
* @property {boolean} [ignoreMultiDiskErrors]
* @property {boolean} [allowTruncatedCentralDirectory]
*/


export default class ReadArchiveOptions extends Options{
export default class ReadArchiveOptions extends Options {
/** @type {number} */ centralDirectoryBufferSize = Constants.DEFAULT_CHUNK_SIZE;
/** @type {boolean} */ createEntryIndex = true;
/** @type {EntryOptions|EntryOptionsObject} */ entryOptions = {};
/** @type {boolean} */ ignoreMultiDiskErrors = false;
/** @type {boolean} */ allowTruncatedCentralDirectory = false;
}
