Commit 363bfee

Rate Per Micro-Batch Data Source

1 parent e3fb09c commit 363bfee

File tree: 4 files changed (+56 −29 lines)
docs/datasources/rate-micro-batch/RatePerMicroBatchProvider.md

Lines changed: 9 additions & 9 deletions

# RatePerMicroBatchProvider

`RatePerMicroBatchProvider` is a `SimpleTableProvider` ([Spark SQL]({{ book.spark_sql }}/connector/SimpleTableProvider)) registered under the [rate-micro-batch](#shortName) alias.

## <span id="DataSourceRegister"><span id="shortName"> DataSourceRegister

`RatePerMicroBatchProvider` is a `DataSourceRegister` ([Spark SQL]({{ book.spark_sql }}/DataSourceRegister)) that registers the `rate-micro-batch` alias.

## Create Table { #getTable }

??? note "SimpleTableProvider"

    ```scala
    getTable(
      options: CaseInsensitiveStringMap): Table
    ```

    `getTable` is part of the `SimpleTableProvider` ([Spark SQL]({{ book.spark_sql }}/connector/SimpleTableProvider#getTable)) abstraction.

`getTable` creates a [RatePerMicroBatchTable](RatePerMicroBatchTable.md) with the given [options](options.md) (a `CaseInsensitiveStringMap`).
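To illustrate what registering a `shortName` alias buys, here is a minimal plain-Scala sketch of a `DataSourceRegister`-style lookup (no Spark on the classpath; `DataSourceRegisterLike`, `RatePerMicroBatchProviderLike` and `Registry` are illustrative names, not Spark's actual resolution code):

```scala
// A DataSourceRegister-style provider exposes a short alias for lookups.
trait DataSourceRegisterLike {
  def shortName: String
}

// Stand-in for RatePerMicroBatchProvider with its "rate-micro-batch" alias.
class RatePerMicroBatchProviderLike extends DataSourceRegisterLike {
  override val shortName: String = "rate-micro-batch"
}

object Registry {
  private val providers = Seq(new RatePerMicroBatchProviderLike)

  // Resolution is case-insensitive, mirroring CaseInsensitiveStringMap semantics.
  def lookup(format: String): Option[DataSourceRegisterLike] =
    providers.find(_.shortName.equalsIgnoreCase(format))
}
```

In Spark itself, the alias is what lets users write `spark.readStream.format("rate-micro-batch")` instead of the provider's fully-qualified class name.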

docs/datasources/rate-micro-batch/RatePerMicroBatchTable.md

Lines changed: 29 additions & 15 deletions

* `RatePerMicroBatchProvider` is requested for the [table](RatePerMicroBatchProvider.md#getTable)

## Table Capabilities { #capabilities }

??? note "Table"

    ```scala
    capabilities(): Set[TableCapability]
    ```

    `capabilities` is part of the `Table` ([Spark SQL]({{ book.spark_sql }}/connector/Table#capabilities)) abstraction.

`capabilities` is exactly the `MICRO_BATCH_READ` table capability.

## Schema { #schema }

??? note "Table"

    ```scala
    schema(): StructType
    ```

    `schema` is part of the `Table` ([Spark SQL]({{ book.spark_sql }}/connector/Table#schema)) abstraction.

Name | Data Type
-----|----------
`timestamp` | `TimestampType`
`value` | `LongType`

## Create ScanBuilder { #newScanBuilder }

??? note "SupportsRead"

    ```scala
    newScanBuilder(
      options: CaseInsensitiveStringMap): ScanBuilder
    ```

    `newScanBuilder` is part of the `SupportsRead` ([Spark SQL]({{ book.spark_sql }}/connector/SupportsRead#newScanBuilder)) abstraction.

`newScanBuilder` creates a `Scan` ([Spark SQL]({{ book.spark_sql }}/connector/Scan)) that creates a [RatePerMicroBatchStream](RatePerMicroBatchStream.md) when requested for a `MicroBatchStream` ([Spark SQL]({{ book.spark_sql }}/connector/Scan#toMicroBatchStream)).
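The rows the stream produces follow from the schema above and the [options](options.md). The following is a self-contained sketch (plain Scala, no Spark) of that contract — a fixed number of rows per micro-batch, `value` increasing monotonically across batches, and every row in a batch sharing a timestamp that advances by `advanceMillisPerBatch`. The helper `generateBatch` is illustrative; the actual partitioned generation in `RatePerMicroBatchStream` may differ in detail:

```scala
// One emitted row: the schema is (timestamp, value).
final case class RateRow(timestampMs: Long, value: Long)

// Rows for micro-batch `batchId`, per the documented options:
// - all rows of a batch share one timestamp,
// - the timestamp advances by advanceMillisPerBatch per batch,
// - values continue where the previous batch stopped.
def generateBatch(
    batchId: Long,
    rowsPerBatch: Long,
    startTimestampMs: Long,
    advanceMillisPerBatch: Long): Seq[RateRow] = {
  val batchTimestamp = startTimestampMs + batchId * advanceMillisPerBatch
  val startValue = batchId * rowsPerBatch
  (0L until rowsPerBatch).map(i => RateRow(batchTimestamp, startValue + i))
}
```

With `rowsPerBatch = 3`, `startTimestamp = 0` and `advanceMillisPerBatch = 1000`, batch 0 yields values 0..2 at timestamp 0 and batch 1 yields values 3..5 at timestamp 1000.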

docs/datasources/rate-micro-batch/index.md

Lines changed: 5 additions & 0 deletions

---
hide:
  - toc
---

# Rate Per Micro-Batch Data Source

**Rate Per Micro-Batch Data Source** provides a consistent number of rows per micro-batch.
docs/datasources/rate-micro-batch/options.md

Lines changed: 13 additions & 5 deletions

# Options

## <span id="ADVANCE_MILLIS_PER_BATCH"> advanceMillisPerBatch { #advanceMillisPerBatch }

default: `1000`

Must be non-negative

## <span id="NUM_PARTITIONS"> numPartitions { #numPartitions }

default: `SparkContext.defaultParallelism`

Must be positive

## <span id="ROWS_PER_BATCH"> rowsPerBatch { #rowsPerBatch }

default: `0`

Must be positive

## <span id="START_TIMESTAMP"> startTimestamp { #startTimestamp }

default: `0`

Must be non-negative
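The constraints above can be captured in a small plain-Scala sketch (not Spark's actual validation code; the `RatePerMicroBatchOptions` case class and its field names are illustrative):

```scala
// Hedged sketch of the documented option constraints:
// rowsPerBatch and numPartitions must be positive,
// advanceMillisPerBatch and startTimestamp must be non-negative.
final case class RatePerMicroBatchOptions(
    rowsPerBatch: Long,
    numPartitions: Int,
    advanceMillisPerBatch: Long,
    startTimestamp: Long) {
  require(rowsPerBatch > 0, s"rowsPerBatch must be positive: $rowsPerBatch")
  require(numPartitions > 0, s"numPartitions must be positive: $numPartitions")
  require(advanceMillisPerBatch >= 0,
    s"advanceMillisPerBatch must be non-negative: $advanceMillisPerBatch")
  require(startTimestamp >= 0,
    s"startTimestamp must be non-negative: $startTimestamp")
}
```

An invalid combination (e.g. `rowsPerBatch = 0`) fails fast with an `IllegalArgumentException` from `require`, mirroring how the source rejects bad option values at setup rather than mid-stream.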
