FAQ.md — 25 additions & 34 deletions
@@ -1,5 +1,4 @@
 1. [Does Karafka require Ruby on Rails?](#does-karafka-require-ruby-on-rails)
-1. [Why there used to be an ApplicationController mentioned in the Wiki and some articles?](#why-there-used-to-be-an-applicationcontroller-mentioned-in-the-wiki-and-some-articles)
 1. [Does Karafka require Redis and/or Sidekiq to work?](#does-karafka-require-redis-andor-sidekiq-to-work)
 1. [Could an HTTP controller also consume a fetched message through the Karafka router?](#could-an-http-controller-also-consume-a-fetched-message-through-the-karafka-router)
 1. [Does Karafka require a separate process running?](#does-karafka-require-a-separate-process-running)
@@ -8,7 +7,6 @@
 1. [Why Karafka does not pre-initializes consumers prior to first message from a given topic being received?](#why-karafka-does-not-pre-initializes-consumers-prior-to-first-message-from-a-given-topic-being-received)
 1. [Does Karafka restart dead PG connections?](#does-karafka-restart-dead-pg-connections)
 1. [Does Karafka require gems to be thread-safe?](#does-karafka-require-gems-to-be-thread-safe)
-1. [When Karafka is loaded via railtie in test env, SimpleCov does not track code changes](#when-karafka-is-loaded-via-a-railtie-in-test-env-simplecov-does-not-track-code-changes)
 1. [Can I use Thread.current to store data in between batches?](#can-i-use-threadcurrent-to-store-data-between-batches)
 1. [Why Karafka process does not pick up newly created topics until restarted?](#why-karafka-process-does-not-pick-up-newly-created-topics-until-restarted)
 1. [Why is Karafka not doing work in parallel when I started two processes?](#why-is-karafka-not-doing-work-in-parallel-when-i-started-two-processes)
@@ -40,7 +38,6 @@
 1. [Why, despite setting `initial_offset` to `earliest`, Karafka is not picking up messages from the beginning?](#why-despite-setting-initial_offset-to-earliest-karafka-is-not-picking-up-messages-from-the-beginning)
 1. [Should I TSTP, wait a while, then send TERM or set a longer `shutdown_timeout` and only send a TERM signal?](#should-i-tstp-wait-a-while-then-send-term-or-set-a-longer-shutdown_timeout-and-only-send-a-term-signal)
 1. [Why am I getting `error:0A000086:SSL routines::certificate verify failed` after upgrading Karafka?](#why-am-i-getting-error0a000086ssl-routinescertificate-verify-failed-after-upgrading-karafka)
-1. [Why am I seeing a `karafka_admin` consumer group with a constant lag present?](#why-am-i-seeing-a-karafka_admin-consumer-group-with-a-constant-lag-present)
 1. [Can I consume the same topic independently using two consumers within the same application?](#can-i-consume-the-same-topic-independently-using-two-consumers-within-the-same-application)
 1. [Why am I seeing Broker failed to validate record (invalid_record) error?](#why-am-i-seeing-broker-failed-to-validate-record-invalid_record-error)
 1. [How can I make polling faster?](#how-can-i-make-polling-faster)
@@ -153,7 +150,6 @@
 1. [Is it possible to exclude `karafka-web` related reporting counts from the web UI dashboard?](#is-it-possible-to-exclude-karafka-web-related-reporting-counts-from-the-web-ui-dashboard)
 1. [Can I log errors in Karafka with topic, partition, and other consumer details?](#can-i-log-errors-in-karafka-with-topic-partition-and-other-consumer-details)
 1. [Why did our Kafka consumer start from the beginning after a 2-week downtime, but resumed correctly after a brief stop and restart?](#why-did-our-kafka-consumer-start-from-the-beginning-after-a-2-week-downtime-but-resumed-correctly-after-a-brief-stop-and-restart)
-1. [Why am I experiencing a load error when using Karafka with Ruby 2.7, and how can I fix it?](#why-am-i-experiencing-a-load-error-when-using-karafka-with-ruby-27-and-how-can-i-fix-it)
 1. [Why am I getting `+[NSCharacterSet initialize] may have been in progress in another thread when fork()` error when forking on macOS?](#why-am-i-getting-nscharacterset-initialize-may-have-been-in-progress-in-another-thread-when-fork-error-when-forking-on-macos)
 1. [How does Karafka handle messages with undefined topics, and can they be routed to a default consumer?](#how-does-karafka-handle-messages-with-undefined-topics-and-can-they-be-routed-to-a-default-consumer)
 1. [What happens if an error occurs while consuming a message in Karafka? Will the message be marked as not consumed and automatically retried?](#what-happens-if-an-error-occurs-while-consuming-a-message-in-karafka-will-the-message-be-marked-as-not-consumed-and-automatically-retried)
@@ -225,11 +221,7 @@
 
 ## Does Karafka require Ruby on Rails?
 
-**No**. Karafka is a fully independent framework that can operate in a standalone mode. It can be easily integrated with any Ruby-based application, including those written with Ruby on Rails. Please follow the [Integrating with Ruby on Rails and other frameworks](https://github.com/karafka/karafka/wiki/Integrating-with-Ruby-on-Rails-and-other-frameworks) Wiki section.
-
-## Why there used to be an ApplicationController mentioned in the Wiki and some articles?
-
-You can name the main application consumer with any name. You can even call it ```ApplicationController``` or anything else you want. Karafka will sort that out, as long as your root application consumer inherits from the ```Karafka::BaseConsumer```. It's not related to Ruby on Rails controllers. Karafka framework used to use the ```*Controller``` naming convention up until Karafka 1.2 where it was changed because many people had problems with name collisions.
+**No**. Karafka is a fully independent framework that can operate in a standalone mode. It can be easily integrated with any Ruby-based application, including those written with Ruby on Rails. Please follow the [Integrating with Ruby on Rails and other frameworks](Integrating-with-Ruby-on-Rails-and-other-frameworks) documentation.
 
 ## Does Karafka require Redis and/or Sidekiq to work?
 
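The updated answer above says Karafka runs standalone, without Rails. As a rough sketch (class, topic, and consumer names here are illustrative, not taken from the diff), a minimal `karafka.rb` boot file might look like:

```ruby
# karafka.rb -- minimal standalone setup, no Rails required (illustrative names)
require "karafka"

# Any name works for the root consumer, as long as it
# inherits from Karafka::BaseConsumer
class EventsConsumer < Karafka::BaseConsumer
  def consume
    messages.each { |message| puts message.payload }
  end
end

class KarafkaApp < Karafka::App
  setup do |config|
    config.kafka = { 'bootstrap.servers': '127.0.0.1:9092' }
    config.client_id = 'standalone_app'
  end

  routes.draw do
    topic :events do
      consumer EventsConsumer
    end
  end
end
```

Running `bundle exec karafka server` against this file would then consume without any Rails dependency.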
@@ -241,7 +233,7 @@ You can name the main application consumer with any name. You can even call it `
 
 ## Does Karafka require a separate process running?
 
-No, however, it is **recommended**. By default, Karafka requires a separate process (Karafka server) to consume and process messages. You can read about it in the [Consuming messages](https://github.com/karafka/karafka/wiki/Consuming-Messages) section of the Wiki.
+No, however, it is **recommended**. By default, Karafka requires a separate process (Karafka server) to consume and process messages. You can read about it in the [Consuming messages](Consuming-Messages) section of the documentation.
 
 Karafka can also be embedded within another process so you do not need to run a separate process. You can read about it [here](Embedding).
 
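For the embedding case mentioned in the hunk above, a sketch of how this is typically wired into a Puma config (assuming Karafka 2.x's `Karafka::Embedded` API; hook names are Puma's):

```ruby
# config/puma.rb -- sketch: run Karafka inside Puma workers
# instead of a separate `karafka server` process

on_worker_boot do
  # Starts Karafka consumption in background threads of this process
  ::Karafka::Embedded.start
end

on_worker_shutdown do
  # Stops consumption gracefully before the worker exits
  ::Karafka::Embedded.stop
end
```

The same start/stop pair can be placed in any long-running process's lifecycle hooks (e.g., Sidekiq's), per the Embedding documentation the answer links to.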
@@ -271,16 +263,12 @@ Because Karafka does not have knowledge about the whole topology of a given Kafk
 
 ## Does Karafka restart dead PG connections?
 
-Karafka, starting from `2.0.16` will automatically release no longer used ActiveRecord connections. They should be handled and reconnected by the Rails connection reaper. You can implement custom logic to reconnect them yourself if needed beyond the reaping frequency. More details on that can be found [here](Active-Record-Connections-Management#dealing-with-dead-database-connections).
+Karafka will automatically release no longer used ActiveRecord connections. They should be handled and reconnected by the Rails connection reaper. You can implement custom logic to reconnect them yourself if needed beyond the reaping frequency. More details on that can be found [here](Active-Record-Connections-Management#dealing-with-dead-database-connections).
 
 ## Does Karafka require gems to be thread-safe?
 
 Yes. Karafka uses multiple threads to process data, similar to how Puma or Sidekiq does it. The same rules apply.
 
-## When Karafka is loaded via a railtie in test env, SimpleCov does not track code changes
-
-Karafka hooks with railtie to load `karafka.rb`. Simplecov **needs** to be required [before](https://github.com/simplecov-ruby/simplecov#getting-started=) any code is loaded.
-
 ## Can I use Thread.current to store data between batches?
 
 **No**. The first available thread will pick up work from the queue to better distribute work. This means that you should **not** use `Thread.current` for any type of data storage.
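A minimal pure-Ruby sketch (not Karafka's actual worker pool) illustrating the point in that answer: whichever thread is free picks the next job, so thread-local data set while handling one job can later be observed by an unrelated job that happens to land on the same thread.

```ruby
queue   = Queue.new  # jobs waiting for a free worker
results = Queue.new  # (topic, thread-local value seen) pairs

# Two workers, as in a typical thread pool
workers = 2.times.map do
  Thread.new do
    while (job = queue.pop) != :stop
      # Unsafe pattern: thread-local state is tied to the THREAD,
      # not to the message or batch being processed
      Thread.current[:last_topic] ||= job[:topic]
      results << [job[:topic], Thread.current[:last_topic]]
    end
  end
end

[{ topic: "a" }, { topic: "b" }, { topic: "a" }].each { |j| queue << j }
2.times { queue << :stop }
workers.each(&:join)

seen = Array.new(3) { results.pop }
# Depending on scheduling, the third job may observe :last_topic left
# over from an earlier, unrelated job that ran on the same thread
```

State that must survive between batches belongs on the consumer instance (e.g., an instance variable), which Karafka keeps per topic partition, not in `Thread.current`.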
@@ -372,7 +360,7 @@ Upon rebalance, all uncommitted offsets will be committed before a given partiti
 
 ## Can I use Karafka with Ruby on Rails as a part of an internal gem?
 
-Karafka 2.x has [Rails auto-detection](https://github.com/karafka/karafka/blob/78ea23f7044b81b7e0c74bb02ad3d2e5a5fa1b7c/lib/karafka/railtie.rb#L19), and it is loaded early, so some components may be available later, e.g., when ApplicationConsumer inherits from BaseConsumer that is provided by the separate gem that needs an initializer.
+Karafka has Rails auto-detection and loads early, so some components may be available later, e.g., when ApplicationConsumer inherits from BaseConsumer that is provided by the separate gem that needs an initializer.
 
 Moreover, despite the same code base, some processes (`rails s`, `rails db:migrate`, `sidekiq s`) may not need to know about karafka, and there is no need to load it.
 
@@ -576,7 +564,7 @@ To make Kafka accept messages bigger than 1MB, you must change both Kafka and Ka
 
 To increase the maximum accepted payload size in Kafka, you can adjust the `message.max.bytes` and `replica.fetch.max.bytes` configuration parameters in the server.properties file. These parameters control the maximum size of a message the Kafka broker will accept.
 
-To allow [WaterDrop](https://github.com/karafka/waterdrop) (Karafka producer) to send bigger messages, you need to:
+To allow WaterDrop (Karafka producer) to send bigger messages, you need to:
 
 - set the `max_payload_size` config option to value in bytes matching your maximum expected payload.
 - set `kafka` scoped `message.max.bytes` to the same value.
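The two bullets above might be sketched in a standalone WaterDrop setup like this (10 MB is an illustrative value; both settings must also stay within the broker-side `message.max.bytes`):

```ruby
require "waterdrop"

producer = WaterDrop::Producer.new do |config|
  config.kafka = {
    'bootstrap.servers': '127.0.0.1:9092',
    # librdkafka-level limit; align with the broker configuration
    'message.max.bytes': 10_000_000
  }
  # WaterDrop's own payload size guard, in bytes
  config.max_payload_size = 10_000_000
end
```

If only one of the two is raised, oversized messages will still be rejected by whichever limit was left at its default.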
@@ -792,10 +780,6 @@ class KarafkaApp < Karafka::App
 end
 ```
 
-## Why am I seeing a `karafka_admin` consumer group with a constant lag present?
-
-The `karafka_admin` consumer group was created when using certain admin API operations. After upgrading to karafka `2.0.37` or higher, this consumer group is no longer needed and can be safely removed.
-
 ## Can I consume the same topic independently using two consumers within the same application?
 
 Yes. You can define independent consumer groups operating within the same application. Let's say you want to consume messages from a topic called `event` using two consumers. You can do this as follows:
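The diff truncates the answer's own example, so here is a sketch of the idea using Karafka's routing DSL (group and consumer names are illustrative): two consumer groups each subscribe to `event`, and each receives every message independently.

```ruby
class KarafkaApp < Karafka::App
  routes.draw do
    # Each consumer group maintains its own offsets for the topic
    consumer_group :group_a do
      topic :event do
        consumer EventConsumerA
      end
    end

    consumer_group :group_b do
      topic :event do
        consumer EventConsumerB
      end
    end
  end
end
```

Because the groups are independent, a slow `EventConsumerB` does not hold back `EventConsumerA`'s progress on the same topic.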
@@ -1400,7 +1384,23 @@ The `range` strategy has some advantages over the `round-robin` strategy, where
 
 Since data is often related within the same partition, `range` can keep related data processing within the same consumer, which could lead to benefits like better caching or business logic efficiencies. This can be useful, for example, to join records from two topics with the same number of partitions and the same key-partitioning logic.
 
-The assignment strategy is not a one-size-fits-all solution and can be changed based on the specific use case. If you want to change the assignment strategy in Karafka, you can set the `partition.assignment.strategy` configuration value to either `range`, `roundrobin` or `cooperative-sticky`. It's important to consider your particular use case, the number of consumers, and the nature of your data when choosing your assignment strategy.
+The assignment strategy is not a one-size-fits-all solution and can be changed based on the specific use case.
+
+**Recommended approaches:**
+
+1. **KIP-848 Consumer Protocol (Kafka 4.0+)** - This is the recommended approach for new deployments:
+   - Set `group.protocol` to `consumer` to use the new protocol
+   - Configure `group.remote.assignor` (e.g., `uniform` or `range`)
+2. **Cooperative-Sticky (for older Kafka versions)** - Use when KIP-848 is not available:
+   - Set `partition.assignment.strategy` to `cooperative-sticky`
+   - Provides incremental rebalancing benefits over eager protocols
+   - Good fallback option for teams on older infrastructure
+3. **Legacy strategies** - `range` or `roundrobin` for specific use cases or compatibility requirements
+
+It's important to consider your Kafka broker version, particular use case, the number of consumers, and the nature of your data when choosing your assignment strategy.
 
 For Kafka 4.0+ with KRaft mode, you can also use the [next-generation consumer group protocol (KIP-848)](Kafka-New-Rebalance-Protocol) with `group.protocol: 'consumer'`, which offers significantly improved rebalance performance.
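The settings named in the added list are plain librdkafka keys, so in Karafka they would go into the `kafka` config hash. A sketch (broker address is illustrative):

```ruby
class KarafkaApp < Karafka::App
  setup do |config|
    config.kafka = {
      'bootstrap.servers': '127.0.0.1:9092',
      # Classic protocol: pick an assignment strategy explicitly
      'partition.assignment.strategy': 'cooperative-sticky'
      # On Kafka 4.0+ you could instead opt into KIP-848, under which
      # partition.assignment.strategy no longer applies:
      #   'group.protocol': 'consumer',
      #   'group.remote.assignor': 'uniform'
    }
  end
end
```

Note that switching between eager (`range`, `roundrobin`) and cooperative strategies, or onto the KIP-848 protocol, generally requires a coordinated rollout of the consumer group rather than a single mixed deployment.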
@@ -1457,7 +1457,7 @@ Karafka provides ways to implement password protection, and you can find detaile
 
 Yes, it's possible to use a Karafka producer without a consumer in two ways:
 
-1. You can use [WaterDrop](https://github.com/karafka/waterdrop), a standalone Karafka component for producing Kafka messages. WaterDrop was explicitly designed for use cases where only message production is required, with no need for consumption.
+1. You can use WaterDrop, a standalone Karafka component for producing Kafka messages. WaterDrop was explicitly designed for use cases where only message production is required, with no need for consumption.
 
 1. Alternatively, if you have Karafka already in your application, avoid running the `karafka server` command, as it won't make sense without any topics to consume. You can run other processes and produce messages from them. In scenarios like that, there is no need to define any routes. `Karafka#producer` should operate without any problems.
 
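For the second option above, the producer is available anywhere the Karafka app is loaded; a sketch (topic name and payloads are illustrative):

```ruby
# Assumes karafka.rb has been required; no `karafka server`
# process and no routes are needed for producing
require "json"

# Blocking delivery: returns once the broker acknowledges
Karafka.producer.produce_sync(
  topic: "events",
  payload: { id: 1 }.to_json
)

# Non-blocking delivery: enqueues and returns immediately
Karafka.producer.produce_async(
  topic: "events",
  payload: { id: 2 }.to_json
)
```

This is the same WaterDrop producer underneath, so its configuration (including `max_payload_size`) applies here too.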
@@ -1888,13 +1888,13 @@ It is indicative of a connectivity issue. Let's break down the meaning and impli
 
 1. **Implications for Karafka Web UI**:
 
-    - If you're experiencing this issue with topics related to Karafka Web UI, it's essential to note that Karafka improved its error handling in version 2.2.2. If you're using an older version, upgrading to the latest Karafka and Karafka Web UI versions might alleviate the issue.
+    - If you're experiencing this issue with topics related to Karafka Web UI, upgrading to the latest Karafka and Karafka Web UI versions is recommended as error handling has been continuously improved.
 
     - Another scenario where this error might pop up is during rolling upgrades of the Kafka cluster. If the Karafka Web UI topics have a replication factor 1, there's no redundancy for the partition data. During a rolling upgrade, as brokers are taken down sequentially for upgrades, there might be brief windows where the partition's data isn't available due to its residing broker being offline.
 
 Below, you can find a few recommendations in case you encounter this error:
 
-1. **Upgrade Karafka**: If you're running a version older than `2.2.2`, consider upgrading both Karafka and Karafka Web UI. This might resolve the issue if it's related to previous error-handling mechanisms.
+1. **Upgrade Karafka**: Always use the latest stable versions of Karafka and Karafka Web UI to benefit from improved error handling and bug fixes.
 
 1. **Review Configurations**: Examine your Karafka client configurations, especially timeouts and broker addresses, to ensure they're set appropriately.
 
@@ -2130,15 +2130,6 @@ This issue is likely due to the `offsets.retention.minutes` setting in Kafka. Ka
 
 You can read more about this behavior [here](Operations-Development-vs-Production#configure-your-brokers-offsetsretentionminutes-policy).
 
-## Why am I experiencing a load error when using Karafka with Ruby 2.7, and how can I fix it?
-
-If you're experiencing a load error with Karafka on Ruby 2.7, it's due to a bug in Bundler. To fix this:
-
-- **Update RubyGems to v3.4.22**: Run `gem update --system 3.4.22 --no-document`.
-
-Note: Ruby 2.7 is EOL and no longer supported. For better security and functionality, upgrading to Ruby 3.0 or higher is highly recommended.
-
 ## Why am I getting `+[NSCharacterSet initialize] may have been in progress in another thread when fork()` error when forking on macOS?
 
 When running a Rails application with Karafka and Puma on macOS, hitting the Karafka dashboard or endpoints can cause crashes with an error related to fork() and Objective-C initialization. This is especially prevalent in Puma's clustered mode.
Operations/Deployment.md — 42 additions & 1 deletion
@@ -553,7 +553,48 @@ When deploying Karafka consumers using Kubernetes, it's generally not recommende
 
 For larger deployments with many consumer processes, it's especially important to be mindful of the rebalancing issue.
 
-Overall, when deploying Karafka consumers using Kubernetes, it's important to consider the deployment strategy carefully and to choose a strategy that will minimize the risk of rebalancing issues. By using the `Recreate` strategy and configuring Karafka static group memberships and `cooperative.sticky` rebalance strategy settings, you can ensure that your Karafka application stays reliable and performant, even during large-scale deployments.
+Overall, when deploying Karafka consumers using Kubernetes, it's important to consider the deployment strategy carefully and to choose a strategy that will minimize the risk of rebalancing issues. By using the `Recreate` strategy and configuring Karafka with appropriate rebalancing strategies, you can ensure that your Karafka application stays reliable and performant.
+
+### Choosing the Right Rebalance Strategy
+
+**For teams running Kafka 4.0+:**
+
+- Use the new **KIP-848 consumer protocol** (`group.protocol` set to `consumer`) as your primary choice
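The `Recreate` strategy discussed in this hunk maps to a single field on the Kubernetes Deployment. A sketch (names, image, and replica count are illustrative), showing old consumer pods being fully stopped before new ones start, so only one rebalance cycle occurs per deploy:

```yaml
# Sketch: Karafka consumers deployed with the Recreate strategy
apiVersion: apps/v1
kind: Deployment
metadata:
  name: karafka-consumer
spec:
  replicas: 3
  strategy:
    type: Recreate   # stop all old pods before starting new ones
  selector:
    matchLabels:
      app: karafka-consumer
  template:
    metadata:
      labels:
        app: karafka-consumer
    spec:
      containers:
        - name: karafka
          image: my-app:latest   # illustrative
          command: ["bundle", "exec", "karafka", "server"]
```

The trade-off versus `RollingUpdate` is a short consumption gap during the deploy in exchange for avoiding repeated rebalances as pods are cycled one by one.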
Pro/Long-Running-Jobs.md — 2 additions & 0 deletions
@@ -117,6 +117,8 @@ class KarafkaApp < Karafka::App
 end
 ```
 
+Both strategies help avoid unnecessary partition revocations when partitions would be re-assigned back to the same process.
+
 ### Revocation and re-assignment
 
 In the case of scenario `2`, there is nothing you need to do. Karafka will continue processing your messages and resume partition after it is done with the work.
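The diff does not show which two strategies the added sentence refers to; assuming they are the usual pair recommended for long-running jobs (static group membership and cooperative-sticky rebalancing), a configuration sketch would be:

```ruby
class KarafkaApp < Karafka::App
  setup do |config|
    config.kafka = {
      'bootstrap.servers': '127.0.0.1:9092',
      # Static group membership: the broker remembers this member
      # across restarts, avoiding a rebalance on quick restarts.
      # Must be unique per process in the group (value illustrative).
      'group.instance.id': 'lrj-process-1',
      # Incremental rebalancing: only moved partitions are revoked
      'partition.assignment.strategy': 'cooperative-sticky'
    }
  end
end
```

Both are librdkafka-level settings; whether they match the strategies meant by this page should be checked against the surrounding (untruncated) document.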