11-06-2018 12:59 AM
Hi Team,
I have a requirement to read/consume messages from Kafka on a daily basis. Another team writes messages to Kafka each day, and my exact need is to read only the messages for that particular day. For example, on the first day 500 messages are written to Kafka, and I need to consume those 500 messages that day. On the next day 400 messages are written, and I need to consume those 400 messages starting from the 501st message.
Developed so far:
I was able to develop the above, but if I get an error after reading some of the messages, can I read again from the 501st message?
Appreciate your quick response
Thanks in Advance!
11-06-2018 01:02 PM
Have you explored the option of using the Confluent Kafka Acknowledge Snap? The goal should be to wait until all messages are consumed and then start acknowledging, instead of using the auto-commit property, which is enabled by default on the Consumer Snap.
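Outside the Snap settings, the underlying idea is the standard Kafka manual-commit pattern. A minimal sketch with the plain Apache Kafka Java client (the broker address, group id and topic name below are placeholders, not your actual setup): disable enable.auto.commit and call commitSync() only after the polled records have been processed, so a failure before the commit leaves the committed offsets unchanged and the same messages are re-read on the next run.

```java
import java.time.Duration;
import java.util.Collections;
import java.util.Properties;

import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class ManualCommitConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");  // placeholder broker
        props.put("group.id", "daily-batch-consumer");      // placeholder group id
        props.put("enable.auto.commit", "false");            // commit manually, not on a timer
        props.put("key.deserializer",
                  "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer",
                  "org.apache.kafka.common.serialization.StringDeserializer");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("daily-topic"));  // placeholder topic

            ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(10));
            for (ConsumerRecord<String, String> record : records) {
                // Process / stage the record; if this throws, nothing is committed.
                System.out.printf("partition=%d offset=%d value=%s%n",
                        record.partition(), record.offset(), record.value());
            }

            // Commit only after all polled records were handled successfully.
            // On failure the committed offsets are unchanged, so the next run
            // re-reads from the same position (e.g. message 501).
            consumer.commitSync();
        }
    }
}
```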
11-07-2018 09:57 PM
Hi @sriram,
Thanks for your information.
I need to combine all of the messages into a single consolidated file and ensure that the consolidated file is written successfully to Azure. In that case, will it read all of the messages for that day, or will it read a single message and wait for the notification from the Acknowledge Snap?
If it is per message, then once the notification is sent to the Consumer Snap that message gets committed. Now, if there is a connection error to Azure while writing the consolidated file, I cannot read today's messages again.
Is there any way to read from today's starting offset?
11-08-2018 03:37 PM
If you have a pipeline configured with auto-commit unchecked (on the Consumer Snap) along with the Acknowledge Snap, then messages are acknowledged one at a time, as and when they are consumed successfully.
The “Seek Type” field can be set to “Specify Offset” along with a value assigned to the “Offset” field if you want to start from a particular known offset.
Documentation reference: Confluent Kafka Consumer
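For comparison, seeking to a known offset with the plain Kafka Java client is done per partition via assign() and seek(); the Snap’s “Specify Offset” field exposes the same idea. A rough sketch, with the topic, partition number and offset as placeholder values:

```java
import java.time.Duration;
import java.util.Collections;
import java.util.Properties;

import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.TopicPartition;

public class SeekToOffsetExample {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");  // placeholder broker
        props.put("group.id", "daily-batch-consumer");      // placeholder group id
        props.put("enable.auto.commit", "false");
        props.put("key.deserializer",
                  "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer",
                  "org.apache.kafka.common.serialization.StringDeserializer");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            TopicPartition partition = new TopicPartition("daily-topic", 0);  // placeholder
            consumer.assign(Collections.singletonList(partition));

            // Jump straight to a known offset, e.g. the first unread message of the day.
            consumer.seek(partition, 501L);

            ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(10));
            records.forEach(r ->
                    System.out.printf("offset=%d value=%s%n", r.offset(), r.value()));
        }
    }
}
```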
11-08-2018 09:03 PM
That offset is not known, because we don’t know how many messages come in for one day. There are 8 partitions, and messages are being distributed to the partitions in a random fashion, so I don’t think we should go for storing offsets.
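One alternative worth noting, if the plain Kafka Java client is ever an option here: instead of storing offsets, the client can resolve “today’s starting offset” for every partition from a timestamp with offsetsForTimes() and seek there, which also copes with the 8 randomly filled partitions. A rough sketch (broker, group id and topic are placeholders):

```java
import java.time.Duration;
import java.time.LocalDate;
import java.time.ZoneOffset;
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;

import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.consumer.OffsetAndTimestamp;
import org.apache.kafka.common.TopicPartition;

public class SeekToStartOfDay {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");  // placeholder broker
        props.put("group.id", "daily-batch-consumer");      // placeholder group id
        props.put("enable.auto.commit", "false");
        props.put("key.deserializer",
                  "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer",
                  "org.apache.kafka.common.serialization.StringDeserializer");

        // Midnight UTC of the current day, as a Kafka record timestamp (ms).
        long startOfDayMs = LocalDate.now(ZoneOffset.UTC)
                .atStartOfDay(ZoneOffset.UTC).toInstant().toEpochMilli();

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            String topic = "daily-topic";  // placeholder topic
            Map<TopicPartition, Long> query = new HashMap<>();
            for (int p = 0; p < 8; p++) {  // 8 partitions, as described above
                query.put(new TopicPartition(topic, p), startOfDayMs);
            }
            consumer.assign(query.keySet());

            // Resolve the first offset at or after midnight for every partition,
            // then seek there; nothing needs to be stored between runs.
            Map<TopicPartition, OffsetAndTimestamp> offsets = consumer.offsetsForTimes(query);
            offsets.forEach((tp, oat) -> {
                if (oat != null) {  // null means no messages yet today on that partition
                    consumer.seek(tp, oat.offset());
                }
            });

            consumer.poll(Duration.ofSeconds(10)).forEach(r ->
                    System.out.printf("partition=%d offset=%d%n", r.partition(), r.offset()));
        }
    }
}
```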