Commit b38734c

2540 Autogenerated files moved to rp-connect-docs (#28)
Co-authored-by: JakeSCahill <[email protected]>
1 parent 04874e0 commit b38734c

252 files changed: +60191 -9 lines changed

.gitignore

Lines changed: 2 additions & 1 deletion
@@ -1,3 +1,4 @@
 node_modules
 .vscode
-docs
+docs
+*.DS_Store

local-antora-playbook.yml

Lines changed: 0 additions & 8 deletions
@@ -8,13 +8,6 @@ urls:
 content:
   sources:
   # Fetches autogenerated docs.
-  - url: https://github.com/redpanda-data/connect
-    # The latest tag is fetched from GitHub during the build in:
-    # https://github.com/redpanda-data/docs-extensions-and-macros/blob/main/README.adoc#redpanda-connect-tag-modifier.
-    # This is a fallback version.
-    tags: ['v4.29.0']
-    branches: ~
-    start_paths: [docs]
   - url: .
     branches: HEAD
   - url: https://github.com/redpanda-data/documentation
@@ -40,7 +33,6 @@ asciidoc:
 antora:
   extensions:
   - require: '@redpanda-data/docs-extensions-and-macros/extensions/generate-rp-connect-categories'
-  - require: '@redpanda-data/docs-extensions-and-macros/extensions/modify-connect-tag-playbook'
   - require: '@redpanda-data/docs-extensions-and-macros/extensions/unpublish-pages'
   - require: '@redpanda-data/docs-extensions-and-macros/extensions/unlisted-pages'
   - require: '@redpanda-data/docs-extensions-and-macros/extensions/add-global-attributes'

modules/ROOT/nav.adoc

Lines changed: 1 addition & 0 deletions
@@ -212,6 +212,7 @@
 *** xref:components:outputs/pinecone.adoc[]
 *** xref:components:outputs/pulsar.adoc[]
 *** xref:components:outputs/pusher.adoc[]
+*** xref:components:outputs/qdrant.adoc[]
 *** xref:components:outputs/redis_hash.adoc[]
 *** xref:components:outputs/redis_list.adoc[]
 *** xref:components:outputs/redis_pubsub.adoc[]
Lines changed: 172 additions & 0 deletions
@@ -0,0 +1,172 @@
+= memory
+:type: buffer
+:status: stable
+:categories: ["Utility"]
+
+// © 2024 Redpanda Data Inc.
+
+
+component_type_dropdown::[]
+
+
+Stores consumed messages in memory and acknowledges them at the input level. During shutdown Redpanda Connect will make a best attempt at flushing all remaining messages before exiting cleanly.
+
+
+[tabs]
+======
+Common::
++
+--
+
+```yml
+# Common config fields, showing default values
+buffer:
+  memory:
+    limit: 524288000
+    batch_policy:
+      enabled: false
+      count: 0
+      byte_size: 0
+      period: ""
+      check: ""
+```
+
+--
+Advanced::
++
+--
+
+```yml
+# All config fields, showing default values
+buffer:
+  memory:
+    limit: 524288000
+    batch_policy:
+      enabled: false
+      count: 0
+      byte_size: 0
+      period: ""
+      check: ""
+      processors: [] # No default (optional)
+```
+
+--
+======
+
+This buffer is appropriate when consuming messages from inputs that do not gracefully handle back pressure and where delivery guarantees aren't critical.
+
+This buffer has a configurable limit, where consumption will be stopped with back pressure upstream if the total size of messages in the buffer reaches this amount. Since this calculation is only an estimate, and the real size of messages in RAM is always higher, it is recommended to set the limit significantly below the amount of RAM available.
+
+== Delivery guarantees
+
+This buffer intentionally weakens the delivery guarantees of the pipeline and therefore should never be used in places where data loss is unacceptable.
+
+== Batching
+
+It is possible to batch up messages sent from this buffer using a xref:configuration:batching.adoc#batch-policy[batch policy].
+
+== Fields
+
+=== `limit`
+
+The maximum buffer size (in bytes) to allow before applying backpressure upstream.
+
+
+*Type*: `int`
+
+*Default*: `524288000`
+
+=== `batch_policy`
+
+Optionally configure a policy to flush buffered messages in batches.
+
+
+*Type*: `object`
+
+
+=== `batch_policy.enabled`
+
+Whether to batch messages as they are flushed.
+
+
+*Type*: `bool`
+
+*Default*: `false`
+
+=== `batch_policy.count`
+
+A number of messages at which the batch should be flushed. If `0` disables count based batching.
+
+
+*Type*: `int`
+
+*Default*: `0`
+
+=== `batch_policy.byte_size`
+
+An amount of bytes at which the batch should be flushed. If `0` disables size based batching.
+
+
+*Type*: `int`
+
+*Default*: `0`
+
+=== `batch_policy.period`
+
+A period in which an incomplete batch should be flushed regardless of its size.
+
+
+*Type*: `string`
+
+*Default*: `""`
+
+```yml
+# Examples
+
+period: 1s
+
+period: 1m
+
+period: 500ms
+```
+
+=== `batch_policy.check`
+
+A xref:guides:bloblang/about.adoc[Bloblang query] that should return a boolean value indicating whether a message should end a batch.
+
+
+*Type*: `string`
+
+*Default*: `""`
+
+```yml
+# Examples
+
+check: this.type == "end_of_transaction"
+```
+
+=== `batch_policy.processors`
+
+A list of xref:components:processors/about.adoc[processors] to apply to a batch as it is flushed. This allows you to aggregate and archive the batch however you see fit. Please note that all resulting messages are flushed as a single batch, therefore splitting the batch into smaller batches using these processors is a no-op.
+
+
+*Type*: `array`
+
+
+```yml
+# Examples
+
+processors:
+  - archive:
+      format: concatenate
+
+processors:
+  - archive:
+      format: lines
+
+processors:
+  - archive:
+      format: json_array
+```
+
+
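
Note on the `memory` buffer page added above: the fields it documents could be combined roughly as in the sketch below, which lowers `limit` and enables a batch policy. This is not part of the commit. The field names come from the page itself, but the values (a 100 MiB limit, 50 messages, a 1s period, the `lines` archive format) are assumptions picked for illustration, not recommendations.

```yml
# Illustrative sketch only: every value here is an assumption, not a default.
buffer:
  memory:
    limit: 104857600 # 100 MiB, assumed to sit well below the RAM available
    batch_policy:
      enabled: true
      count: 50 # flush once 50 messages have accumulated
      period: 1s # flush an incomplete batch after 1s
      processors:
        - archive:
            format: lines # join the batch into one newline-delimited message
```

Because the page says the size calculation is only an estimate, a limit like this would leave headroom rather than match physical memory.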
Lines changed: 23 additions & 0 deletions
@@ -0,0 +1,23 @@
+= none
+:type: buffer
+:status: stable
+
+// © 2024 Redpanda Data Inc.
+
+
+component_type_dropdown::[]
+
+
+Do not buffer messages. This is the default and most resilient configuration.
+
+```yml
+# Config fields, showing default values
+buffer:
+  none: {}
+```
+
+Selecting no buffer means the output layer is directly coupled with the input layer. This is the safest and lowest latency option since acknowledgements from at-least-once protocols can be propagated all the way from the output protocol to the input protocol.
+
+If the output layer is hit with back pressure it will propagate all the way to the input layer, and further up the data stream. If you need to relieve your pipeline of this back pressure consider using a more robust buffering solution such as Kafka before resorting to alternatives.
+
+
Lines changed: 93 additions & 0 deletions
@@ -0,0 +1,93 @@
+= sqlite
+:type: buffer
+:status: stable
+:categories: ["Utility"]
+
+// © 2024 Redpanda Data Inc.
+
+
+component_type_dropdown::[]
+
+
+Stores messages in an SQLite database and acknowledges them at the input level.
+
+```yml
+# Config fields, showing default values
+buffer:
+  sqlite:
+    path: "" # No default (required)
+    pre_processors: [] # No default (optional)
+    post_processors: [] # No default (optional)
+```
+
+Stored messages are then consumed as a stream from the database and deleted only once they are successfully sent at the output level. If the service is restarted Redpanda Connect will make a best attempt to finish delivering messages that are already read from the database, and when it starts again it will consume from the oldest message that has not yet been delivered.
+
+== Delivery guarantees
+
+Messages are not acknowledged at the input level until they have been added to the SQLite database, and they are not removed from the SQLite database until they have been successfully delivered. This means at-least-once delivery guarantees are preserved in cases where the service is shut down unexpectedly. However, since this process relies on interaction with the disk (wherever the SQLite DB is stored) these delivery guarantees are not resilient to disk corruption or loss.
+
+== Batching
+
+Messages that are logically batched at the point where they are added to the buffer will continue to be associated with that batch when they are consumed. This buffer is also more efficient when storing messages within batches, and therefore it is recommended to use batching at the input level in high-throughput use cases even if they are not required for processing.
+
+
+== Fields
+
+=== `path`
+
+The path of the database file, which will be created if it does not already exist.
+
+
+*Type*: `string`
+
+
+=== `pre_processors`
+
+An optional list of processors to apply to messages before they are stored within the buffer. These processors are useful for compressing, archiving or otherwise reducing the data in size before it's stored on disk.
+
+
+*Type*: `array`
+
+
+=== `post_processors`
+
+An optional list of processors to apply to messages after they are consumed from the buffer. These processors are useful for undoing any compression, archiving, etc that may have been done by your `pre_processors`.
+
+
+*Type*: `array`
+
+
+== Examples
+
+[tabs]
+======
+Batching for optimization::
++
+--
+
+Batching at the input level greatly increases the throughput of this buffer. If logical batches aren't needed for processing add a xref:components:processors/split.adoc[`split` processor] to the `post_processors`.
+
+```yaml
+input:
+  batched:
+    child:
+      sql_select:
+        driver: postgres
+        dsn: postgres://foouser:foopass@localhost:5432/testdb?sslmode=disable
+        table: footable
+        columns: [ '*' ]
+    policy:
+      count: 100
+      period: 500ms
+
+buffer:
+  sqlite:
+    path: ./foo.db
+    post_processors:
+      - split: {}
+```
+
+--
+======
+
+
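
Note on the `sqlite` buffer page added above: it describes `pre_processors` as useful for compressing data before it is stored and `post_processors` for undoing that, but shows no paired example. The sketch below is one way the pairing might look and is not part of the commit. It assumes the `compress` and `decompress` processors with `algorithm: gzip`, which this diff does not document, and reuses the `./foo.db` path from the page's own example.

```yml
# Illustrative sketch only: the compress/decompress pairing is an assumption,
# not something this commit documents.
buffer:
  sqlite:
    path: ./foo.db
    pre_processors:
      - compress:
          algorithm: gzip # shrink each message before it is written to disk
    post_processors:
      - decompress:
          algorithm: gzip # restore the original payload after it is read back
```

The two lists are meant to mirror each other: whatever the `pre_processors` do to a message, the `post_processors` undo, so downstream processing sees the original payload.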
