
Commit 3008c40

Build docs for all pushes and PRs (#598)
1 parent eff0f8e commit 3008c40

11 files changed: +86 -131 lines changed
@@ -1,30 +1,16 @@
-name: Publish Docs
+name: Build documentation
 
 on:
   push:
-    branches:
-      - master
-
-permissions:
-  actions: read
-  pages: write
-  id-token: write
+  pull_request:
 
 jobs:
-  build-and-deploy:
+  build:
     runs-on: ubuntu-latest
     steps:
       - name: Checkout Repository
         uses: actions/checkout@v6
 
-      - name: Setup Python
-        uses: actions/setup-python@v6
-        with:
-          python-version: 3.x
-
-      - name: Run Preprocessing Script
-        run: python docs/tools/preprocess_docs.py
-
       - name: Setup .NET
         uses: actions/setup-dotnet@v5
         with:
@@ -34,13 +20,25 @@ jobs:
         run: dotnet tool update -g docfx
 
       - name: Build Documentation
-        run: docfx docfx.json
+        run: docfx --warningsAsErrors docfx.json
         working-directory: ./docs
 
       - name: Upload Site Artifact
         uses: actions/upload-pages-artifact@v4
         with:
           path: './docs/_site'
 
+  deploy:
+    if: github.event_name == 'push' && github.ref == 'refs/heads/master' && !github.event.repository.fork
+    runs-on: ubuntu-latest
+    needs: build
+    permissions:
+      pages: write
+      id-token: write
+    environment:
+      name: github-pages
+      url: ${{ steps.deployment.outputs.page_url }}
+    steps:
       - name: Deploy to GitHub Pages
+        id: deployment
         uses: actions/deploy-pages@v4

docs/guides/Arrow.md

Lines changed: 14 additions & 14 deletions
@@ -4,14 +4,14 @@ The Apache Parquet C++ library provides APIs for reading and writing data in the
 These are wrapped by ParquetSharp using the [Arrow C data interface](https://arrow.apache.org/docs/format/CDataInterface.html)
 to allow high performance reading and writing of Arrow data with zero copying of array data between C++ and .NET.
 
-The Arrow API is contained in the `ParquetSharp.Arrow` namespace,
+The Arrow API is contained in the @ParquetSharp.Arrow namespace,
 and included in the [ParquetSharp NuGet package](https://www.nuget.org/packages/ParquetSharp/).
 
 ## Reading Arrow data
 
-Reading Parquet data in Arrow format uses a `ParquetSharp.Arrow.FileReader`.
-This can be constructed using a file path, a .NET `System.IO.Stream`,
-or a subclass of `ParquetSharp.IO.RandomAccessFile`.
+Reading Parquet data in Arrow format uses a @ParquetSharp.Arrow.FileReader.
+This can be constructed using a file path, a .NET @System.IO.Stream,
+or a subclass of @ParquetSharp.IO.RandomAccessFile.
 In this example, we'll open a file using a path:
 
 ```csharp
@@ -68,9 +68,9 @@ the reader properties, discussed below.
 
 ### Reader properties
 
-The `ParquetSharp.Arrow.FileReader` constructor accepts an instance of
-`ParquetSharp.ReaderProperties` to control standard Parquet reading behaviour,
-and additionally accepts an instance of `ParquetSharp.Arrow.ArrowReaderProperties`
+The @ParquetSharp.Arrow.FileReader constructor accepts an instance of
+@ParquetSharp.ReaderProperties to control standard Parquet reading behaviour,
+and additionally accepts an instance of @ParquetSharp.Arrow.ArrowReaderProperties
 to customise Arrow specific behaviour:
 
 ```csharp
@@ -94,7 +94,7 @@ using var fileReader = new FileReader(
 
 ## Writing Arrow data
 
-The `ParquetSharp.Arrow.FileWriter` class allows writing Parquet files
+The @ParquetSharp.Arrow.FileWriter class allows writing Parquet files
 using Arrow format data.
 
 In this example we'll walk through writing a file with a timestamp,
@@ -134,15 +134,15 @@ RecordBatch GetBatch(int batchNumber) =>
     }, numIds);
 ```
 
-Now we create a `ParquetSharp.Arrow.FileWriter`, specifying the path to write to and the
+Now we create a @ParquetSharp.Arrow.FileWriter, specifying the path to write to and the
 file schema:
 
 ```csharp
 using var writer = new FileWriter("data.parquet", schema);
 ```
 
-Rather than specifying a file path, we could also write to a .NET `System.IO.Stream`
-or a subclass of `ParquetSharp.IO.OutputStream`.
+Rather than specifying a file path, we could also write to a .NET @System.IO.Stream
+or a subclass of @ParquetSharp.IO.OutputStream.
 
 ### Writing data in batches
 
@@ -207,9 +207,9 @@ writer.Close();
 
 ### Writer properties
 
-The `ParquetSharp.Arrow.FileWriter` constructor accepts an instance of
-`ParquetSharp.WriterProperties` to control standard Parquet writing behaviour,
-and additionally accepts an instance of `ParquetSharp.Arrow.ArrowWriterProperties`
+The @ParquetSharp.Arrow.FileWriter constructor accepts an instance of
+@ParquetSharp.WriterProperties to control standard Parquet writing behaviour,
+and additionally accepts an instance of @ParquetSharp.Arrow.ArrowWriterProperties
 to customise Arrow specific behaviour:
 
 ```csharp
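
For orientation, a minimal sketch of reading a file with the Arrow `FileReader` described in this guide, assuming the `GetRecordBatchReader` API and the `Apache.Arrow` `RecordBatch` type shown in the guide's examples:

```csharp
using System;
using System.Threading.Tasks;
using Apache.Arrow;
using ParquetSharp.Arrow;

internal static class ArrowReadExample
{
    public static async Task Main()
    {
        // Open the Parquet file via the Arrow API (a Stream or RandomAccessFile also works).
        using var fileReader = new FileReader("data.parquet");

        // Stream the whole file back as Arrow record batches.
        using var batchReader = fileReader.GetRecordBatchReader();
        while (await batchReader.ReadNextRecordBatchAsync() is RecordBatch batch)
        {
            using (batch)
            {
                Console.WriteLine($"Read batch with {batch.Length} rows and {batch.ColumnCount} columns");
            }
        }
    }
}
```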

docs/guides/Encryption.md

Lines changed: 17 additions & 17 deletions
@@ -27,7 +27,7 @@ Double wrapping is enabled by default.
 For further details, see the
 [Key Management Tools design document](https://docs.google.com/document/d/1bEu903840yb95k9q2X-BlsYKuXoygE4VnMDl9xz_zhk).
 
-The Key Management Tools API is contained in the `ParquetSharp.Encryption` namespace.
+The Key Management Tools API is contained in the @ParquetSharp.Encryption namespace.
 In order to use this API,
 a client for a Key Management Service must be implemented:
 
@@ -55,7 +55,7 @@ internal sealed class MyKmsClient : IKmsClient
 ```
 
 The main entrypoint for the Key Management Tools API is the
-`ParquetSharp.Encryption.CryptoFactory` class.
+@ParquetSharp.Encryption.CryptoFactory class.
 This requires a factory method for creating KMS clients,
 which are cached internally and periodically recreated:
 
@@ -76,7 +76,7 @@ kmsConnectionConfig.KmsInstanceUrl = ...;
 kmsConnectionConfig.KeyAccessToken = ...;
 ```
 
-Then to configure how the file is encrypted, an `ParquetSharp.Encryption.EncryptionConfiguration` is created:
+Then to configure how the file is encrypted, an @ParquetSharp.Encryption.EncryptionConfiguration is created:
 
 ```c#
 string footerKeyId = ...;
@@ -113,7 +113,7 @@ encryptionConfig.PlaintextFooter = true;
 ```
 
 The `kmsConnectionConfig` and `encryptionConfiguration` are used to generate
-file encryption properties, which are used to build the `ParquetSharp.WriterProperties`:
+file encryption properties, which are used to build the @ParquetSharp.WriterProperties:
 
 ```c#
 using var fileEncryptionProperties = cryptoFactory.GetFileEncryptionProperties(
@@ -126,7 +126,7 @@ using var writerProperties = writerPropertiesBuilder
     .Build();
 ```
 
-Finally, the Parquet file can be written using the `ParquetSharp.WriterProperties`:
+Finally, the Parquet file can be written using the @ParquetSharp.WriterProperties:
 
 ```c#
 Column[] columns = ...;
@@ -136,9 +136,9 @@ using var fileWriter = new ParquetFileWriter(parquetFilePath, columns, writerPro
 
 ### Reading Encrypted Files
 
-Reading encrypted files requires creating `ParquetSharp.FileDecryptionProperties`
-with a `ParquetSharp.Encryption.CryptoFactory`, and adding these to the
-`ParquetSharp.ReaderProperties`:
+Reading encrypted files requires creating @ParquetSharp.FileDecryptionProperties
+with a @ParquetSharp.Encryption.CryptoFactory, and adding these to the
+@ParquetSharp.ReaderProperties:
 
 ```c#
 using var decryptionConfig = new DecryptionConfiguration();
@@ -164,16 +164,16 @@ Key material is stored inside the Parquet file metadata by default,
 but key material can also be stored in separate JSON files alongside Parquet files,
 to allow rotation of master keys without needing to rewrite the Parquet files.
 
-This is configured in the `ParquetSharp.Encryption.EncryptionConfiguration`:
+This is configured in the @ParquetSharp.Encryption.EncryptionConfiguration:
 
 ```c#
 using var encryptionConfig = new EncryptionConfiguration(footerKeyId);
 encryptionConfig.InternalKeyMaterial = false; // External key material
 ```
 
 When using external key material, the path to the Parquet file being written or read
-must be specified when creating `ParquetSharp.FileEncryptionProperties` and
-`ParquetSharp.FileDecryptionProperties`:
+must be specified when creating @ParquetSharp.FileEncryptionProperties and
+@ParquetSharp.FileDecryptionProperties:
 
 ```c#
 using var fileEncryptionProperties = cryptoFactory.GetFileEncryptionProperties(
@@ -247,7 +247,7 @@ using var fileDecryptionProperties = builder
 ```
 
 Rather than having to specify decryption keys directly, a
-`ParquetSharp.DecryptionKeyRetriever` can be used to retrieve keys
+@ParquetSharp.DecryptionKeyRetriever can be used to retrieve keys
 based on the key metadata, to allow more flexibility:
 
 ```c#
@@ -298,7 +298,7 @@ using var fileDecryptionProperties = builder
 
 If the AAD prefix doesn't match the expected prefix an exception will be thrown when reading the file.
 
-Alternatively, you can implement an `ParquetSharp.AadPrefixVerifier` if you have more complex verification logic:
+Alternatively, you can implement an @ParquetSharp.AadPrefixVerifier if you have more complex verification logic:
 
 ```c#
 internal sealed class MyAadVerifier : ParquetSharp.AadPrefixVerifier
@@ -324,8 +324,8 @@ using var fileDecryptionProperties = builder
 
 ## Arrow API Compatibility
 
-Note that the above examples use the `ParquetSharp.ParquetFileReader` and
-`ParquetSharp.ParquetFileWriter` classes, but encryption may also be used with the Arrow API.
-The `ParquetSharp.Arrow.FileReader` and `ParquetSharp.Arrow.FileWriter` constructors
-accept `ParquetSharp.ReaderProperties` and `ParquetSharp.WriterProperties` parameters
+Note that the above examples use the @ParquetSharp.ParquetFileReader and
+@ParquetSharp.ParquetFileWriter classes, but encryption may also be used with the Arrow API.
+The @ParquetSharp.Arrow.FileReader and @ParquetSharp.Arrow.FileWriter constructors
+accept @ParquetSharp.ReaderProperties and @ParquetSharp.WriterProperties parameters
 respectively, which can have encryption properties configured.
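
A condensed sketch of the encrypt/decrypt flow this guide describes; `MyKmsClient` is the `IKmsClient` implementation from the guide, the key id is a placeholder, and the exact member names assumed here (`Encryption` on the writer properties builder, `GetDefaultReaderProperties`, the `FileDecryptionProperties` setter) should be checked against the API reference:

```csharp
using ParquetSharp;
using ParquetSharp.Encryption;

// A CryptoFactory is created with a factory method for KMS clients.
using var cryptoFactory = new CryptoFactory(config => new MyKmsClient(config));
var kmsConnectionConfig = new KmsConnectionConfig();

// Writing: generate file encryption properties and attach them to the writer properties.
using var encryptionConfig = new EncryptionConfiguration("footer-key-id");
using var fileEncryptionProperties = cryptoFactory.GetFileEncryptionProperties(
    kmsConnectionConfig, encryptionConfig);
using var writerProperties = new WriterPropertiesBuilder()
    .Encryption(fileEncryptionProperties)
    .Build();

// Reading: generate file decryption properties and attach them to the reader properties.
using var decryptionConfig = new DecryptionConfiguration();
using var fileDecryptionProperties = cryptoFactory.GetFileDecryptionProperties(
    kmsConnectionConfig, decryptionConfig);
using var readerProperties = ReaderProperties.GetDefaultReaderProperties();
readerProperties.FileDecryptionProperties = fileDecryptionProperties;
```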

docs/guides/Nested.md

Lines changed: 5 additions & 5 deletions
@@ -7,7 +7,7 @@ but the Parquet format can be used to represent data with a complex nested struc
 
 In order to write a file with nested columns,
 we must define the Parquet file schema explicitly as a graph structure using schema nodes,
-rather than using ParquetSharp's `ParquetSharp.Column` type.
+rather than using ParquetSharp's @ParquetSharp.Column type.
 
 Imagine we have the following JSON object we would like to store as Parquet:
 
@@ -41,8 +41,8 @@ or we had a non-null object with a null `message` and null `ids`.
 Instead, we will represent this data in Parquet with a single
 `objects` column.
 
-In order to define the schema, we will be using `ParquetSharp.Schema.PrimitiveNode`
-and `ParquetSharp.Schema.GroupNode`.
+In order to define the schema, we will be using @ParquetSharp.Schema.PrimitiveNode
+and @ParquetSharp.Schema.GroupNode.
 
 In the Parquet schema, we have one one top-level group node named `objects`,
 which contains two nested fields, `ids` and `message`.
@@ -74,7 +74,7 @@ using var schema = new GroupNode(
 
 ### Writing data
 
-We can then create a `ParquetSharp.ParquetFileWriter` with this schema:
+We can then create a @ParquetSharp.ParquetFileWriter with this schema:
 
 ```csharp
 using var propertiesBuilder = new WriterPropertiesBuilder();
@@ -85,7 +85,7 @@ using var fileWriter = new ParquetFileWriter("objects.parquet", schema, writerPr
 
 When writing data to this file,
 the leaf-level values written must be nested within ParquetSharp's
-`ParquetSharp.Nested` type to indicate they are contained in a group,
+@ParquetSharp.Nested type to indicate they are contained in a group,
 and allow nullable nested structures to be represented unambiguously.
 
 For example, both the `objects` and `message` fields are optional,

docs/guides/PowerShell.md

Lines changed: 3 additions & 3 deletions
@@ -1,6 +1,6 @@
 # ParquetSharp in PowerShell
 
-The main requirement to using ParquetSharp from PowerShell is that `ParquetSharpNative.dll` is in the `PATH` or in the same directory as `ParquetSharp.dll`. The following guide shows one possible approach to achieve this:
+The main requirement to using ParquetSharp from PowerShell is that @ParquetSharpNative.dll is in the `PATH` or in the same directory as @ParquetSharp.dll. The following guide shows one possible approach to achieve this:
 
 ### Installation
 
@@ -23,7 +23,7 @@ Copy-Item -Path ".\lib\System.Runtime.CompilerServices.Unsafe.4.5.3\lib\net461\S
 Copy-Item -Path ".\lib\System.ValueTuple.4.5.0\lib\net461\System.ValueTuple.dll" -Destination ".\bin"
 ```
 
-Finally, copy `ParquetSharp.dll` and `ParquetSharpNative.dll` into `bin`. This will depend on the current version of ParquetSharp, as well as your architecture and OS:
+Finally, copy @ParquetSharp.dll and @ParquetSharpNative.dll into `bin`. This will depend on the current version of ParquetSharp, as well as your architecture and OS:
 
 ```powershell
 # Replace path with the appropriate version of ParquetSharp
@@ -36,7 +36,7 @@ Copy-Item -Path ".\lib\ParquetSharp.12.1.0\runtimes\win-x64\native\ParquetSharpN
 The available runtime architectures are `win-x64`, `linux-x64`, `linux-arm64`, `osx-x64`, and `osx-arm64`.
 
 ### Usage
-Use `Add-Type` to load `ParquetSharp.dll`. Note that we're using custom directories:
+Use `Add-Type` to load @ParquetSharp.dll. Note that we're using custom directories:
 
 ```powershell
 # Replace path with the appropriate versions of ParquetSharp

docs/guides/Reading.md

Lines changed: 8 additions & 8 deletions
@@ -1,8 +1,8 @@
 # Reading Parquet files
 
-The low-level ParquetSharp API provides the `ParquetSharp.ParquetFileReader` class for reading Parquet files.
+The low-level ParquetSharp API provides the @ParquetSharp.ParquetFileReader class for reading Parquet files.
 This is usually constructed from a file path, but may also be constructed from a
-`ParquetSharp.IO.ManagedRandomAccessFile`, which wraps a .NET `System.IO.Stream` that supports seeking.
+@ParquetSharp.IO.ManagedRandomAccessFile, which wraps a .NET @System.IO.Stream that supports seeking.
 
 ```csharp
 using var fileReader = new ParquetFileReader("data.parquet");
@@ -15,7 +15,7 @@ using var fileReader = new ParquetFileReader(input);
 
 ### Obtaining file metadata
 
-The `ParquetSharp.FileMetaData` property of a `ParquetFileReader` exposes information about the Parquet file and its schema:
+The @ParquetSharp.FileMetaData property of a `ParquetFileReader` exposes information about the Parquet file and its schema:
 
 ```csharp
 int numColumns = fileReader.FileMetaData.NumColumns;
@@ -34,7 +34,7 @@ for (int columnIndex = 0; columnIndex < schema.NumColumns; ++columnIndex) {
 
 Parquet files store data in separate row groups, which all share the same schema,
 so if you wish to read all data in a file, you generally want to loop over all of the row groups
-and create a `ParquetSharp.RowGroupReader` for each one:
+and create a @ParquetSharp.RowGroupReader for each one:
 
 ```csharp
 for (int rowGroup = 0; rowGroup < fileReader.FileMetaData.NumRowGroups; ++rowGroup) {
@@ -45,10 +45,10 @@ for (int rowGroup = 0; rowGroup < fileReader.FileMetaData.NumRowGroups; ++rowGro
 
 ### Reading columns directly
 
-The `Column` method of `RowGroupReader` takes an integer column index and returns a `ParquetSharp.ColumnReader` object,
+The `Column` method of `RowGroupReader` takes an integer column index and returns a @ParquetSharp.ColumnReader object,
 which can read primitive values from the column, as well as raw definition level and repetition level data.
 Usually you will not want to use a `ColumnReader` directly, but instead call its `LogicalReader` method to
-create a `ParquetSharp.LogicalColumnReader` that can read logical values.
+create a @ParquetSharp.LogicalColumnReader that can read logical values.
 There are two variations of this `LogicalReader` method; the plain `LogicalReader` method returns an abstract
 `LogicalColumnReader`, whereas the generic `LogicalReader<TElement>` method returns a typed `LogicalColumnReader<TElement>`,
 which reads values of the specified element type.
@@ -96,7 +96,7 @@ When reading Timestamp to a DateTime, ParquetSharp sets the DateTimeKind based o
 
 If `IsAdjustedToUtc` is `true` the DateTimeKind will be set to `DateTimeKind.Utc` otherwise it will be set to `DateTimeKind.Unspecified`.
 
-This behavior can be overwritten by setting the AppContext switch `ParquetSharp.ReadDateTimeKindAsUnspecified` to `true`, so the DateTimeKind will be always set to `DateTimeKind.Unspecified` regardless of the value of `IsAdjustedToUtc`.
+This behavior can be overwritten by setting the AppContext switch @ParquetSharp.ReadDateTimeKindAsUnspecified to `true`, so the DateTimeKind will be always set to `DateTimeKind.Unspecified` regardless of the value of `IsAdjustedToUtc`.
 This also matches the old behavior of [ParquetSharp < 7.0.0](https://github.com/G-Research/ParquetSharp/pull/261)
 
 ```csharp
@@ -117,7 +117,7 @@ Some legacy implementations of Parquet write timestamps using the Int96 primitiv
 which has been [deprecated](https://issues.apache.org/jira/browse/PARQUET-323).
 ParquetSharp doesn't support reading Int96 values as .NET `DateTime`s
 as not all Int96 timestamp values are representable as a `DateTime`.
-However, there is limited support for reading raw Int96 values using the `ParquetSharp.Int96` type
+However, there is limited support for reading raw Int96 values using the @ParquetSharp.Int96 type
 and it is left to applications to decide how to interpret these values.
 
 ## Long path handling
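
Putting the pieces of this guide together, a small sketch of a full read loop; the column index and `int` element type are illustrative assumptions:

```csharp
using System;
using ParquetSharp;

// Open the file and walk every row group, reading column 0 as logical values.
using var fileReader = new ParquetFileReader("data.parquet");

for (int rowGroup = 0; rowGroup < fileReader.FileMetaData.NumRowGroups; ++rowGroup)
{
    using var rowGroupReader = fileReader.RowGroup(rowGroup);
    long numRows = rowGroupReader.MetaData.NumRows;

    // Column(i) returns a ColumnReader; LogicalReader<T> wraps it for typed reads.
    using var idReader = rowGroupReader.Column(0).LogicalReader<int>();
    int[] ids = idReader.ReadAll(checked((int) numRows));

    Console.WriteLine($"Row group {rowGroup}: read {ids.Length} values");
}

fileReader.Close();
```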

docs/guides/RowOriented.md

Lines changed: 2 additions & 2 deletions
@@ -70,8 +70,8 @@ using (var rowReader = ParquetFile.CreateRowReader<MyRow>("example.parquet"))
 
 ## Reading and writing custom types
 
-The row-oriented API supports reading and writing custom types by providing a `ParquetSharp.LogicalTypeFactory`
-and a `ParquetSharp.LogicalReadConverterFactory` or `ParquetSharp.LogicalWriteConverterFactory`.
+The row-oriented API supports reading and writing custom types by providing a @ParquetSharp.LogicalTypeFactory
+and a @ParquetSharp.LogicalReadConverterFactory or @ParquetSharp.LogicalWriteConverterFactory.
 
 ### Writing custom types
 
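For reference, a small sketch of the row-oriented API referred to above; the `MyRow` fields and `MapToColumn` names are hypothetical:

```csharp
using System;
using ParquetSharp.RowOriented;

internal struct MyRow
{
    [MapToColumn("id")]
    public int Id;

    [MapToColumn("value")]
    public float Value;
}

internal static class RowOrientedExample
{
    public static void Main()
    {
        var rows = new[] { new MyRow { Id = 1, Value = 1.5f }, new MyRow { Id = 2, Value = 2.5f } };

        // Write the rows to a new Parquet file, mapping fields to columns by attribute.
        using (var rowWriter = ParquetFile.CreateRowWriter<MyRow>("example.parquet"))
        {
            rowWriter.WriteRows(rows);
        }

        // Read the rows back, one row group at a time.
        using (var rowReader = ParquetFile.CreateRowReader<MyRow>("example.parquet"))
        {
            for (int rowGroup = 0; rowGroup < rowReader.FileMetaData.NumRowGroups; ++rowGroup)
            {
                foreach (var row in rowReader.ReadRows(rowGroup))
                {
                    Console.WriteLine($"{row.Id}: {row.Value}");
                }
            }
        }
    }
}
```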

docs/guides/TimeSpan.md

Lines changed: 2 additions & 2 deletions
@@ -110,6 +110,6 @@ Note that when using this approach, if you read the file back with
 ParquetSharp the data will be read as `long` values as there's no
 way to tell it was originally `TimeSpan` data.
 To read the data back as `TimeSpan`s, you'll also need to implement
-a custom `ParquetSharp.LogicalReadConverterFactory` and use the `LogicalReadOverride` method
-or provide a custom `ParquetSharp.LogicalTypeFactory`.
+a custom @ParquetSharp.LogicalReadConverterFactory and use the `LogicalReadOverride` method
+or provide a custom @ParquetSharp.LogicalTypeFactory.
 See the [type factories documentation](TypeFactories.md) for more details.
