Replies: 7 comments
-
Might be some overlap with #6531
-
Further thoughts.
-
@baynes You can specify SubscriptionInitialPosition.Earliest when you create the regex subscription.
-
I guess that would do it. It changes the behavior for topics created while the client is down or before it is started for the first time. It is probably useful to read all the messages on those topics when the client comes up, but there could be cases where that is undesirable when the client starts for the first time. If we do go that way, then #6531 must be fixed, as we are actually using functions.
-
@baynes noted.
-
The same applies to a partitioned consumer. IMO, when a consumer finds new topics or partitions, the subscription initial position for them should be earliest, no matter what the original initial position is. Consumers usually use the latest initial position to discard outdated messages. However, suppose the partitions were increased dynamically, i.e. there are producers and consumers currently serving this partitioned topic. If the producers find the new partitions before the consumers do, then from the consumer's point of view the messages published before it starts consuming shouldn't be considered outdated. What do you think of this change? @sijie
-
@sijie - What is the official position on this? Is it suggested to use earliest? I see it has gone stale and has not been updated for two years. We are running into this issue, which is counter-intuitive to how a queue should work.
-
Describe the bug
When a new topic is detected by a regex subscription, it takes time before the subscription's cursor is set up for that topic. Because the cursor is set to the end of the topic, at least one message is lost, and since the setup can take 40 seconds, one could lose 40 seconds of data.
To Reproduce
If I set up a consumer with a regex subscription, for example:
/opt/pulsar/bin/pulsar-client consume --regex '.*' -s all -n 0
I then send a message on a NEW topic that matches the regex.
/opt/pulsar/bin/pulsar-client produce addtopic -m 'm1'
The consumer detects the new topic and sets up a subscription to it. This can take 30-40 seconds. However, it does not see the message (or any other messages sent before the subscription is set up).
Once it is set up, sending more data to the topic will be picked up by the consumer.
/opt/pulsar/bin/pulsar-client produce addtopic -m 'm2'
The consumer will display the message 'm2'.
So although it works from then on, potentially the first 40 seconds of data have been lost.
Expected behavior
All messages sent to the new topic should be seen by the consumer.
Screenshots
N/A
Desktop (please complete the following information):
CentOS 7
Pulsar 2.5.0, 2.5.1
Additional context
The initial message(s) are on the topic; one can see them with a reader. So a solution would be for the cursor of the new topic's subscription to be created pointing to the start of the topic rather than the usual end in this case.
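To make the cursor argument concrete, here is a small toy model in Python. It is only a sketch and does not use or mirror Pulsar's actual code: `Topic`, `Subscription`, `subscribe` and `replay` are hypothetical names made up for illustration, and the list index stands in for a message position. It replays the m1/m2 sequence from the reproduction steps with the cursor created at the end versus the start of the topic.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Topic:
    """Toy append-only topic; the list index doubles as the message offset."""
    messages: List[str] = field(default_factory=list)

    def publish(self, payload: str) -> None:
        self.messages.append(payload)


@dataclass
class Subscription:
    """Toy cursor: the offset of the next message to deliver."""
    topic: Topic
    cursor: int

    def receive_all(self) -> List[str]:
        # Deliver everything from the cursor onward, then advance the cursor.
        delivered = self.topic.messages[self.cursor:]
        self.cursor = len(self.topic.messages)
        return delivered


def subscribe(topic: Topic, initial_position: str) -> Subscription:
    """Create a cursor at the start ("earliest") or the end ("latest") of the topic."""
    if initial_position not in ("earliest", "latest"):
        raise ValueError(f"unknown initial position: {initial_position}")
    start = 0 if initial_position == "earliest" else len(topic.messages)
    return Subscription(topic, start)


def replay(initial_position: str) -> List[str]:
    """Replay the reproduction steps and return what the consumer receives."""
    topic = Topic()
    topic.publish("m1")                       # producer creates the new topic with m1
    sub = subscribe(topic, initial_position)  # consumer discovers the topic later
    topic.publish("m2")                       # sent once the subscription exists
    return sub.receive_all()


print(replay("latest"))    # ['m2']
print(replay("earliest"))  # ['m1', 'm2']
```

In this model the only difference between losing m1 and receiving it is where the cursor starts when the late subscription is created, which is the change suggested above.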