Merge: Add docs on partition handling (#85, !60)
AJ Steers committed Apr 9, 2021
2 parents c15551f + b5317fe commit 09ea0ee
Showing 4 changed files with 19 additions and 14 deletions.
5 changes: 3 additions & 2 deletions docs/implementation/README.md
@@ -4,8 +4,9 @@ This section documents certain behaviors and expectations of the Singer SDK fram

1. [CLI](./cli.md)
2. [Discovery](./discovery.md)
-3. [Partitioning](./partitioning.md)
-4. [State](./state.md)
+3. [Metadata](./discovery.md)
+4. [Metrics](./discovery.md)
+5. [State](./state.md)

## How to use the implementation reference material

8 changes: 4 additions & 4 deletions docs/implementation/state.md
@@ -56,7 +56,7 @@ here.

- _**Preview Feature Notice:** As of version `0.1.0`, the partitioned state feature is in preview status. Implementation details specified here should not be considered final._

-The SDK implements a feature called [partitioning](./partitioning.md) which allows the same
+The SDK implements a feature called [partitioning](../partitioning.md) which allows the same
stream to be segmented by one or more partitioning indexes. The collection of indexes
which uniquely describe a partition are referred to as the partition's 'context'.

@@ -108,16 +108,16 @@ others.

The SDK's implementation of `replication_key` is intentionally within the
framework of a _singular_ column comparison. Most of those use cases which previously
-required multiple bookmarks can now be handled using the [partitioning](./partitioning.md)
+required multiple bookmarks can now be handled using the [partitioning](../partitioning.md)
feature.

While legacy taps have sometimes supported multiple replication key properties,
this is not yet a supported use case within the SDK. If your source requires multiple
-bookmark keys, and if it does not align with the [partitioning](./partitioning.md) feature,
+bookmark keys, and if it does not align with the [partitioning](../partitioning.md) feature,
please open an issue with a detailed description of the intended use case.

## See Also

-- [Singer SDK Partitioning](./partitioning.md)
+- [Singer SDK Partitioning](../partitioning.md)
- [Singer Spec: State Overview](https://github.com/singer-io/getting-started/blob/master/docs/SPEC.md#state)
- [Singer Spec: Config and State](https://github.com/singer-io/getting-started/blob/master/docs/CONFIG_AND_STATE.md#state-file)
17 changes: 10 additions & 7 deletions docs/implementation/partitioning.md → docs/partitioning.md
@@ -1,7 +1,7 @@
-# [Singer SDK Implementation Details](/.README.md) - Stream Partitioning
+# Stream Partitioning

The Singer SDK supports stream partitioning, meaning a set of substreams
-each have their own STATE and their own distinct queryable domain.
+which each have their own state and their own distinct queryable domain.

## If you do not require partitioning

@@ -22,12 +22,15 @@ partition, such as `Stream.get_records()`.

## If you are unsure if partitioning will be needed

-If you are _unsure_ of whether the stream will be partitioned or not, you can always just
-pass along the `partition` argument to any other methods which accept it.
+If you are _unsure_ of whether the stream will be partitioned or not, you can always
+pass along the `partition` argument to any other methods which accept it. This will
+work regardless of whether partition is an actual partition context or `None`, meaning
+no partition is specified.

-For example, developers may always call `Stream.get_stream_or_partition_state(partition)`,
-which retrieves a writable copy of the state for _either_ the stream (if `partition`
-is `None`) or for the `partition` (if `partition` is not `None`).
+When dealing with state, for example, developers may always call
+`Stream.get_stream_or_partition_state(partition)` even if partition is not set.
+The method will automatically return the state that is appropriate, either for the partition
+or for the stream.

## See Also

3 changes: 2 additions & 1 deletion docs/reference.md
@@ -45,7 +45,8 @@ A method which should retrieve data from the source and return records. To optim

Note:

-- This method takes an optional `partition` argument, which you can safely disregard unless you require partition handling.
+- This method takes an optional `partition` argument, which can be safely ignored unless
+the stream requires [partitioning](./partitioning.md).
- Only custom stream types need to define this method. REST and GraphQL streams do not.

## `RESTStream` Class
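
Reviewer sketch (not part of the commit): the pass-through pattern described in the `docs/partitioning.md` changes above can be illustrated with a small standalone toy. Everything here is an assumption made for illustration: `ToyStream`, its attributes, the `"context"` key layout, and the bookmark value are hypothetical and are not the Singer SDK's actual `Stream` class or state format. Only the method name `get_stream_or_partition_state` and the "partition context or `None`" behavior come from the text.

```python
from typing import List, Optional


class ToyStream:
    """Hypothetical stand-in for a stream; NOT the Singer SDK's Stream class."""

    def __init__(self) -> None:
        # Stream-level state, used when no partition is specified.
        self.stream_state: dict = {}
        # One state dict per partition, keyed by its context (assumed layout).
        self.partition_states: List[dict] = []

    def get_stream_or_partition_state(self, partition: Optional[dict]) -> dict:
        """Return a writable state dict for the partition, or for the stream if None."""
        if partition is None:
            return self.stream_state
        for entry in self.partition_states:
            if entry["context"] == partition:
                return entry
        # First time this partition context is seen: create its state entry.
        entry = {"context": dict(partition)}
        self.partition_states.append(entry)
        return entry

    def sync(self, partition: Optional[dict] = None) -> dict:
        # Pass `partition` along unchanged; this works whether or not it is None.
        state = self.get_stream_or_partition_state(partition)
        state["replication_key_value"] = "2021-04-09"  # placeholder bookmark
        return state


stream = ToyStream()
stream.sync()                  # unpartitioned: writes stream-level state
stream.sync({"region": "eu"})  # partitioned: writes state for this context
stream.sync({"region": "us"})
```

In this toy, the caller never branches on whether partitioning is in use; the lookup method alone decides which state dict to hand back, which is the property the revised wording emphasizes.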