-
Notifications
You must be signed in to change notification settings - Fork 977
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add async support on ZipOutputStream #547
Conversation
This is a very rough first pass at adding async support to ZipOutputStream. All method doing synchronous stream writes have been duplicated with an asynchronous counterpart, starting from `ZipOutputStream.PutNextEntryAsync()`. Unfortunately, this makes a lot of duplicated code but is unavoidable to have both sync + async code paths. .NET Standard 2.1 has been added to the target frameworks. It's required to get full async support with `Stream.DisposeAsync()`. New unit tests must be written to cover the new async code paths. Fixes icsharpcode#223
I started adapting some of the existing sync unit tests in #457 (only the deflate ones to start with) with the idea that those could be a base for testing async changes later (as there can be tests that call the async stream functions before implementing said async functions in the zip/gzip streams), but there is more work to do there for the zip/gzip subclass streams.
hopefully there is some scope for refactoring things to avoid duplicating everything! |
.NET Standard 2.1 is perhaps something that can be considered seperately, as you can add it seperately from the other code changes, and that also opens up the possibility of doing things with the Span based apis and such as well? |
I don't think it's possible to consider it separately from the async stream feature because Stream.DisposeAsync() was introduced in .NET Standard 2.1. If you want to totally avoid synchronous stream writes (typically for writing to an ASP.NET Core HTTP response body), you must also support await using var zipStream = new ZipOutputStream(httpContext.Response.Body);
await zipStream.PutNextEntryAsync(…); See also Add async dispose support to ZipArchive which is the exact same issue but for the built-in |
There's some room for improvement there, especially in methods which do several I just wanted to have a quick working proof of concept so duplicating everything and adding async support was faster without refactoring everything. I used a modified version of https://github.com/StephenClearyExamples/AsyncDynamicZip to ensure that the whole chain was actually async. I'll try to write some tests with an AsyncOnlyStream to ensure that no synchronous call was left. |
I mean that you can add a 2.1 target to the build/package without making any other changes, so a seperate PR could be done first to keep that change on its own.
Think I've wondered in the past (in #470 perhaps) if doing many sequential writes for a small amount of fixed size datais the best approach, or if it could be assembled into memory first and then written into the stream in one go |
Oh I see, I misunderstood what you meant. I can do a pull request with just that: adding the .NET Standard 2.1 target framework.
This should probably be benchmarked to make sure what approach is the most performant. Let's hope that writing in memory first is more performant so that we can win on both maintainability and performance! |
Yeah, basically everything that is not actual compressed data should probably just be written to a buffer and then piped asynchronously or not to the base stream. |
This might be getting too far ahead of things, but something to consider in case - The way that ZipOutputStream and GZipOutputStream both derive from DeflatorOutputStream could possibly cause issues if Async functions are added in DeflatorOutputStream but not GZipOutputStream (e.g. does adding a FinishAsync to DeflatorOutputStream mean that someone could call that via an instance of GZipOutputStream and have it do the wrong thing?) |
@Numpsy If we don't implement them, I guess they should be made virtual and overridden by |
Another possible approach to avoid code duplication is using a t4 text template to generate both sync and async versions of the same code. I saw this today in the CsvHelper library. Not sure what is the most maintainable... |
I think this a pretty good use case for meta programming. Too bad the tooling support for T4 is terrible. Maybe that has changed though, haven't tried it with OmniSharp or Rider... |
Rider got support for T4 Text Templates in version 2019.3 and it's pretty decent. |
Actually, that link pointed to another problem; github doesn't highlight them :/ I made an attempt at splitting the logic from the "actual" writes so that just the writes could be put in separate Sync/Async methods in #574. The methods like |
Awesome work, Nils! I'm closing this pull request in favour of yours (much improved) in #574. |
This is a very rough first pass at adding async support to ZipOutputStream.
All method doing synchronous stream writes have been duplicated with an asynchronous counterpart, starting from
ZipOutputStream.PutNextEntryAsync()
. Unfortunately, this makes a lot of duplicated code but is unavoidable to have both sync + async code paths..NET Standard 2.1 has been added to the target frameworks. It's required to get full async support with
Stream.DisposeAsync()
.New unit tests must be written to cover the new async code paths.
Fixes #223
I certify that I own, and have sufficient rights to contribute, all source code and related material intended to be compiled or integrated with the source code for the SharpZipLib open source product (the "Contribution"). My Contribution is licensed under the MIT License.