Refactor and improve readme #232

Merged
merged 2 commits into from
Nov 19, 2024
2 changes: 1 addition & 1 deletion OpenMcdf.Perf/Program.cs
@@ -39,7 +39,7 @@ public static void MultiStorageAndStreamWrite()
 int writeCount = 1024;
 byte[] buffer = new byte[32 * 512];

-Microsoft.IO.RecyclableMemoryStreamManager manager = new ();
+Microsoft.IO.RecyclableMemoryStreamManager manager = new();
 Microsoft.IO.RecyclableMemoryStream baseStream = new(manager);
 baseStream.Capacity = 2 * (storageCount * buffer.Length * writeCount + storageCount * (streamCount - 1) * buffer.Length);

20 changes: 9 additions & 11 deletions OpenMcdf/DirectoryTree.cs
@@ -107,43 +107,40 @@ public void Add(DirectoryEntry entry)
         return;
     }

-    uint previous = currentEntry!.LeftSiblingId;
-    uint next = currentEntry.RightSiblingId;
-
     while (true)
     {
-        int compare = DirectoryEntryComparer.Compare(entry.NameCharSpan, currentEntry!.NameCharSpan);
+        uint leftId = currentEntry!.LeftSiblingId;
+        uint rightId = currentEntry.RightSiblingId;
+
+        int compare = DirectoryEntryComparer.Compare(entry.NameCharSpan, currentEntry.NameCharSpan);
         if (compare < 0)
         {
-            if (previous == StreamId.NoStream)
+            if (leftId == StreamId.NoStream)
             {
                 currentEntry.LeftSiblingId = entry.Id;
                 directories.Write(currentEntry);
                 directories.Write(entry);
                 return;
             }

-            currentEntry = directories.GetDictionaryEntry(previous);
+            currentEntry = directories.GetDictionaryEntry(leftId);
         }
         else if (compare > 0)
         {
-            if (next == StreamId.NoStream)
+            if (rightId == StreamId.NoStream)
             {
                 currentEntry.RightSiblingId = entry.Id;
                 directories.Write(currentEntry);
                 directories.Write(entry);
                 return;
             }

-            currentEntry = directories.GetDictionaryEntry(next);
+            currentEntry = directories.GetDictionaryEntry(rightId);
         }
         else
         {
             throw new IOException($"{entry.Type} \"{entry.NameString}\" already exists.");
         }
-
-        previous = currentEntry!.LeftSiblingId;
-        next = currentEntry!.RightSiblingId;
     }
 }

@@ -209,6 +206,7 @@ internal void WriteTrace(TextWriter writer)
     WriteTrace(writer, current, 0);
 }

+[ExcludeFromCodeCoverage]
 void WriteTrace(TextWriter writer, DirectoryEntry entry, int indent)
 {
     directories.TryGetDictionaryEntry(entry.RightSiblingId, out DirectoryEntry? rightSibling);
2 changes: 1 addition & 1 deletion OpenMcdf/Storage.cs
@@ -125,7 +125,7 @@ public bool TryOpenStorage(string name, [MaybeNullWhen(false)] out Storage? stor

 this.ThrowIfDisposed(Context.IsDisposed);

-directoryTree.TryGetDirectoryEntry(name, out DirectoryEntry? entry);
+directoryTree.TryGetDirectoryEntry(name, out DirectoryEntry? entry);
 if (entry is null || entry.Type is not StorageType.Storage)
 {
     storage = null;
49 changes: 32 additions & 17 deletions README.md
@@ -1,16 +1,33 @@
![GitHub Actions](https://github.com/ironfede/openmcdf/actions/workflows/dotnet-desktop.yml/badge.svg)
[![NuGet Version](https://img.shields.io/nuget/vpre/OpenMcdf)](https://www.nuget.org/packages/OpenMcdf)
[![NuGet Downloads](https://img.shields.io/nuget/dt/OpenMcdf)](https://www.nuget.org/packages/OpenMcdf)

# OpenMcdf

OpenMcdf is a 100% .NET / C# component that allows developers to manipulate [Compound File Binary File Format](https://learn.microsoft.com/en-us/openspecs/windows_protocols/ms-cfb/53989ce4-7b05-4f8d-829b-d08d6148375b) files (also known as OLE structured storage).
OpenMcdf is a fully .NET / C# library to manipulate [Compound File Binary File Format](https://learn.microsoft.com/en-us/openspecs/windows_protocols/ms-cfb/53989ce4-7b05-4f8d-829b-d08d6148375b) files, also known as [Structured Storage](https://learn.microsoft.com/en-us/windows/win32/stg/structured-storage-start-page).

Compound file includes multiple streams of information (document summary, user data) in a single container.
Compound files include multiple streams of information (document summary, user data) in a single container, and are used as the basis for many different file formats:
- Microsoft Office (.doc, .xls, .ppt)
- Windows thumbnails cache files (thumbs.db)
- Outlook messages (.msg)
- Visual Studio Solution Options (.suo)
- Advanced Authoring Format (.aaf)

This file format is used under the hood by a lot of applications: all the documents created by Microsoft Office until the 2007 product release are structured storage files. Windows thumbnails cache files (thumbs.db) are compound documents as well as .msg Outlook messages. Visual Studio .suo files (solution options) are compound files and a lot of audio/video editing tools save project file in a compound container (*.aaf files for example).
OpenMcdf v3 has a rewritten API and supports:
- An idiomatic dotnet API and exception hierarchy
- Fast and efficient enumeration and manipulation of storages and streams
- File sizes up to 16 TB (using major format version 4 with 4096 byte sectors)
- Transactions (i.e. commit and/or revert)
- Consolidation (i.e. reclamation of space by removing free sectors)
- Nullable attributes
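
As a rough check on the 16 TB figure in the list above: sector numbers in a compound file are 32-bit values, and version 4 files use 4096-byte sectors, so the addressable space is approximately

$$
2^{32}\ \text{sectors} \times 2^{12}\ \text{bytes per sector} = 2^{44}\ \text{bytes} = 16\ \text{TiB}
$$

The exact limit is slightly lower, because a few sector numbers at the top of the 32-bit range are reserved as markers (for example free and end-of-chain sectors).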

OpenMcdf supports read/write operations on streams and storages and traversal of structures tree. It supports version 3 and 4 of the specifications, uses lazy loading wherever possible to reduce memory usage and offer an intuitive API to work with structured files.
Limitations:
- No support for red-black tree balancing (directory entries are stored in a tree, but are not balanced, i.e. trees are "all-black")
- No support for single writer, multiple readers
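
To illustrate what the lack of balancing means in practice, here is a minimal, self-contained sketch of insertion into an unbalanced binary search tree. It is not OpenMcdf's internal code: `Node`, `UnbalancedTree` and `Demo` are hypothetical names, and the name ordering (shorter names first, then an ordinal comparison of the upper-cased names) is a simplified stand-in for the compound file name comparison.

```C#
#nullable enable
using System;

// Hypothetical node type for illustration; not part of OpenMcdf.
sealed class Node
{
    public Node(string name) => Name = name;
    public string Name { get; }
    public Node? Left { get; set; }
    public Node? Right { get; set; }
}

static class UnbalancedTree
{
    // Simplified ordering (assumption): shorter names sort first,
    // then names of equal length are compared ordinally after upper-casing.
    static int Compare(string x, string y)
    {
        if (x.Length != y.Length)
            return x.Length < y.Length ? -1 : 1;
        return string.CompareOrdinal(x.ToUpperInvariant(), y.ToUpperInvariant());
    }

    // Walks down from the root and attaches the new node at the first
    // empty left/right slot. No rotations or recolouring are performed.
    public static Node Insert(Node? root, string name)
    {
        Node node = new(name);
        if (root is null)
            return node;

        Node current = root;
        while (true)
        {
            int compare = Compare(name, current.Name);
            if (compare < 0)
            {
                if (current.Left is null) { current.Left = node; return root; }
                current = current.Left;
            }
            else if (compare > 0)
            {
                if (current.Right is null) { current.Right = node; return root; }
                current = current.Right;
            }
            else
            {
                throw new InvalidOperationException($"\"{name}\" already exists.");
            }
        }
    }

    // Height of the tree in nodes (0 for an empty tree).
    public static int Depth(Node? node) =>
        node is null ? 0 : 1 + Math.Max(Depth(node.Left), Depth(node.Right));
}

static class Demo
{
    static void Main()
    {
        Node? root = null;
        foreach (string name in new[] { "A", "B", "C", "D" })
            root = UnbalancedTree.Insert(root, name);

        // Names inserted in sorted order form a chain of right children.
        Console.WriteLine(UnbalancedTree.Depth(root)); // prints 4
    }
}
```

Without rebalancing, inserting names in sorted order yields a chain rather than a bushy tree, so lookups in such a storage degrade from logarithmic to linear in the number of entries.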

It's very easy to **create a new compound file**
## Getting started

To create a new compound file:

```C#
byte[] b = new byte[10000];
@@ -20,23 +37,19 @@ using CfbStream stream = root.CreateStream("MyStream");
stream.Write(b, 0, b.Length);
```

You can **open an existing one**, an Excel workbook (.xls) and use its main data stream
To open an Excel workbook (.xls) and access its main data stream:

```C#
using var root = RootStorage.OpenRead("report.xls");
using CfbStream workbookStream = root.OpenStream("Workbook");
```

Adding **storage and stream items** is just as easy...
To create or delete storages and streams:

```C#
using var root = RootStorage.Create("test.cfb");
root.AddStorage("MyStorage");
root.AddStream("MyStream");
```
...as removing them

```C#
root.CreateStorage("MyStorage");
root.CreateStream("MyStream");
root.Delete("MyStream");
```

@@ -49,14 +62,16 @@ root.Commit();
root.Revert();
```

If you need to compress a compound file, you can purge its unused space:
A root storage can be consolidated to reduce its on-disk size:

```C#
root.Flush(consolidate: true);
```

[OLE Properties](https://learn.microsoft.com/en-us/openspecs/windows_protocols/ms-oleps/bf7aeae8-c47a-4939-9f45-700158dac3bc) handling for DocumentSummaryInfo and SummaryInfo streams
is available via extension methods ***(experimental - API subjected to changes)***
## Object Linking and Embedding (OLE) Property Set Data Structures

Support for reading and writing [OLE Properties](https://learn.microsoft.com/en-us/openspecs/windows_protocols/ms-oleps/bf7aeae8-c47a-4939-9f45-700158dac3bc)
is available via the OpenMcdf.Ole package. However, ***the API is experimental and subject to change.***

```C#
OlePropertiesContainer co = new(stream);
@@ -66,4 +81,4 @@ foreach (OleProperty prop in co.Properties)
}
```

OpenMcdf runs happily on the [Mono](http://www.mono-project.com/) platform and multi-targets **netstandard2.0** and **net8.0** to allow maximum client compatibility.
OpenMcdf runs happily on the [Mono](http://www.mono-project.com/) platform and multi-targets [**netstandard2.0**](https://learn.microsoft.com/en-us/dotnet/standard/net-standard?tabs=net-standard-2-0) and **net8.0** to maximize client compatibility and support modern dotnet features.
2 changes: 1 addition & 1 deletion StructuredStorageExplorer/MainForm.cs
@@ -1,7 +1,7 @@
 #define OLE_PROPERTY

-using OpenMcdf.Ole;
 using OpenMcdf;
+using OpenMcdf.Ole;
 using StructuredStorageExplorer.Properties;
 using System.Collections;
 using System.ComponentModel;
1 change: 1 addition & 0 deletions exclusion.dic
@@ -5,6 +5,7 @@ codepage
 defragmentation
 depersist
 DIFAT
+dotnet
 endian
 enqueuing
 enum