Support reading OLE stream as Stream #62

Closed
phuclv90 opened this issue Dec 16, 2019 · 7 comments

Comments

@phuclv90

I need to save a stream (or parts of it) to a file, but the only methods CFStream offers for getting data are Read(byte[] buffer, long position, int count) and byte[] GetData(), so I have to fetch a byte[] buffer every time and write it to the file. Because the buffer is larger than 85,000 bytes, it is put on the large object heap, and the GC won't collect it right away even when nothing else references it. As a result my app becomes a memory hog when saving big streams, and I have to call GC.Collect() manually.

I've written a custom class that wraps CFStream, extends System.IO.Stream and calls CFStream.Read() inside its Read override, but the result is that performance is almost 10 times slower. I debugged it and found that there are a lot of small reads from the stream, and that a new StreamView is created even when reading just a single byte. After reading, GC.Collect() is called, so there are a lot of GC wake-ups every second.

I ended up working around the issue by wrapping it in another layer, a System.IO.BufferedStream. But it looks like the issue could be solved much more easily and efficiently by exposing StreamView, which is currently an internal class. We would just need to make it public, probably with some other small changes to make it work.
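For readers who want to see the shape of that workaround, here is a minimal sketch. It is not the code from my project and not part of OpenMcdf: `PositionalReadStream` is a hypothetical name, the delegate stands in for `CFStream.Read(byte[] buffer, long position, int count)`, and the 64 KiB buffer size is an assumption chosen to stay below the 85,000-byte large object heap threshold.

```csharp
using System;
using System.IO;

// Hypothetical adapter (not an OpenMcdf type): exposes any positional reader
// of the form (buffer, position, count) -> bytesRead as a read-only Stream.
public sealed class PositionalReadStream : Stream
{
    private readonly Func<byte[], long, int, int> _readAt;
    private readonly long _length;
    private long _position;

    public PositionalReadStream(Func<byte[], long, int, int> readAt, long length)
    {
        _readAt = readAt ?? throw new ArgumentNullException(nameof(readAt));
        _length = length;
    }

    public override bool CanRead => true;
    public override bool CanSeek => true;
    public override bool CanWrite => false;
    public override long Length => _length;

    public override long Position
    {
        get => _position;
        set
        {
            if (value < 0) throw new ArgumentOutOfRangeException(nameof(value));
            _position = value;
        }
    }

    public override int Read(byte[] buffer, int offset, int count)
    {
        long remaining = _length - _position;
        if (remaining <= 0 || count <= 0) return 0;
        int toRead = (int)Math.Min(count, remaining);

        int read;
        if (offset == 0)
        {
            // Common case (BufferedStream refills from offset 0): no extra copy.
            read = _readAt(buffer, _position, toRead);
        }
        else
        {
            // The positional reader has no offset parameter, so go through a temp buffer.
            byte[] tmp = new byte[toRead];
            read = _readAt(tmp, _position, toRead);
            Buffer.BlockCopy(tmp, 0, buffer, offset, read);
        }
        _position += read;
        return read;
    }

    public override long Seek(long offset, SeekOrigin origin)
    {
        long target;
        switch (origin)
        {
            case SeekOrigin.Begin: target = offset; break;
            case SeekOrigin.Current: target = _position + offset; break;
            default: target = _length + offset; break; // SeekOrigin.End
        }
        if (target < 0) throw new IOException("Cannot seek before the start of the stream.");
        _position = target;
        return _position;
    }

    public override void Flush() { }
    public override void SetLength(long value) => throw new NotSupportedException();
    public override void Write(byte[] buffer, int offset, int count) => throw new NotSupportedException();
}
```

Assuming `cfStream` is a CFStream, usage would look roughly like `new BufferedStream(new PositionalReadStream(cfStream.Read, cfStream.Size), 64 * 1024)`. The BufferedStream layer turns many single-byte reads into one larger read against the underlying stream, which is where the speedup comes from.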

@poizan42

Is OpenMcdf.Extensions.CFStreamExtension.AsIOStream not good enough?

@phuclv90
Author

@poizan42 I didn't know about that. There's only a single OpenMcdf.dll in my current project, and I looked around in it to no avail. Searching the repo doesn't help either, because there are so many CFStream hits in the results.

@poizan42

It's in OpenMcdf.Extensions: https://www.nuget.org/packages/OpenMcdf.Extensions
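A rough sketch of how the extension could be used to save a stream to a file, assuming the OpenMcdf 2.x API (`CompoundFile`, `RootStorage.GetStream`) together with the `AsIOStream` extension named above. The class name, method name and parameters are hypothetical, the 64 KiB buffer size is an assumption, and none of this has been checked against a specific package version.

```csharp
using System.IO;
using OpenMcdf;
using OpenMcdf.Extensions;

public static class SaveStreamExample
{
    // Hypothetical helper: copies one named stream of a compound file to disk.
    public static void Save(string compoundPath, string streamName, string outputPath)
    {
        var cf = new CompoundFile(compoundPath);
        try
        {
            CFStream cfStream = cf.RootStorage.GetStream(streamName);

            // BufferedStream is an assumption added here to batch small reads;
            // 64 KiB keeps the buffers below the large object heap threshold.
            using (Stream source = new BufferedStream(cfStream.AsIOStream(), 64 * 1024))
            using (FileStream target = File.Create(outputPath))
            {
                source.CopyTo(target, 64 * 1024);
            }
        }
        finally
        {
            cf.Close();
        }
    }
}
```

Copying in fixed-size chunks avoids materializing the whole stream as one byte[] the way GetData() does.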

@phuclv90
Author

I've checked the source code, and it uses cfStream.Read, which still results in terrible performance for small/random reads and writes.

@IS4Code

IS4Code commented Jun 4, 2021

StreamView itself doesn't have to be public, but CFStream should have a method such as Open/GetStream which constructs it. The only thing missing is to disable writing when the file is opened with CFSUpdateMode.ReadOnly.

@jeremy-visionaid
Collaborator

This should now be addressed as part of #194. Small reads should be fast and efficient in the proof of concept.

@jeremy-visionaid
Collaborator

I'm hopeful that 3.0.0-preview1 should get released to NuGet in the next few days. @ironfede has kindly granted me write access to the repo, so I'll go ahead and close this one as fixed by #194.
