Snowflake Bulk Insert just stuck, does not process any data
I have seen that sometimes my Kafka-to-Snowflake bulk insert just gets stuck and does not process anything, nor does it fail at the “Snowflake Bulk Insert” Snap. For example, if there is a Snowflake outage for 5 minutes, the pipeline stays stuck even after the outage is over, and in that case I restart the pipelines manually. Is there a way to stop the pipeline if it is stuck at the “Snowflake Bulk Insert” Snap for X minutes, or is there a better way to solve this?
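No resolution is posted in this thread, but the "stop it after X minutes" idea the question asks about can be illustrated outside SnapLogic with a client-side watchdog that kills a call once it has been stuck too long. A minimal Python sketch, where run_bulk_insert() and the 10-minute limit are hypothetical placeholders for the hanging step:

```python
import multiprocessing as mp

def run_bulk_insert():
    """Hypothetical stand-in for the load step that occasionally hangs."""
    ...

STUCK_LIMIT_SECONDS = 10 * 60  # treat anything running longer than 10 minutes as stuck

if __name__ == "__main__":
    worker = mp.Process(target=run_bulk_insert)
    worker.start()
    worker.join(timeout=STUCK_LIMIT_SECONDS)
    if worker.is_alive():
        # Still running after the limit: kill the worker and fail loudly so a
        # scheduler can retry, instead of leaving the job hanging indefinitely.
        worker.terminate()
        worker.join()
        raise RuntimeError("bulk insert exceeded the stuck-job limit")
```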
Kafka consumer skips messages when the pipeline fails
Hi, we pull data from Kafka and put it into a database, but we realized that the Kafka Consumer skips the data/offsets if the pipeline fails. For example, in one run the Kafka Consumer is supposed to read offsets 3, 4, and 5, but the pipeline fails, so it skips these offsets in the next run. I tried using the Kafka Acknowledge Snap after the data is inserted into the database, but it always times out. Does anybody have a solution?
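Outside of SnapLogic, the general pattern for not losing offsets is to disable auto-commit and commit an offset only after the record has actually been written to the database, so a failed run re-reads the same offsets on restart. A minimal sketch with the confluent-kafka Python client; the broker address, topic, group id, and write_to_db() helper are placeholders:

```python
from confluent_kafka import Consumer, KafkaException

consumer = Consumer({
    "bootstrap.servers": "localhost:9092",  # placeholder broker
    "group.id": "db-loader",                # placeholder group id
    "enable.auto.commit": False,            # do not commit offsets automatically
    "auto.offset.reset": "earliest",
})
consumer.subscribe(["orders"])              # placeholder topic

def write_to_db(record_value):
    """Hypothetical helper that inserts one record into the database."""
    ...

try:
    while True:
        msg = consumer.poll(timeout=1.0)
        if msg is None:
            continue
        if msg.error():
            raise KafkaException(msg.error())
        write_to_db(msg.value())
        # Commit only after the insert succeeded; if it raised, the offset is
        # not committed and the record is re-delivered on the next run.
        consumer.commit(message=msg, asynchronous=False)
finally:
    consumer.close()
```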
Reliable, High-Throughput Batching with the Kafka Consumer Snap
In this article just published on Medium, we take a closer look at the (Confluent) Kafka Consumer Snap’s new Output Mode setting and how it can be used to achieve reliable, high-throughput performance for some common use cases where it’s important to process records in batches. Here are the release notes for the 423patches7900 version where this feature was introduced.
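For comparison with plain Kafka clients, the batching the article describes is analogous to pulling a bounded batch of records and committing once per batch rather than once per record. A rough sketch with the confluent-kafka Python client; the broker, topic, batch size, and process_batch() step are placeholders:

```python
from confluent_kafka import Consumer

consumer = Consumer({
    "bootstrap.servers": "localhost:9092",  # placeholder broker
    "group.id": "batch-loader",             # placeholder group id
    "enable.auto.commit": False,
    "auto.offset.reset": "earliest",
})
consumer.subscribe(["events"])              # placeholder topic

def process_batch(records):
    """Hypothetical stand-in for the work done on each batch before committing."""
    ...

try:
    while True:
        # Pull up to 500 records, or whatever arrives within 5 seconds.
        batch = consumer.consume(num_messages=500, timeout=5.0)
        if not batch:
            continue
        process_batch([m.value() for m in batch if m.error() is None])
        # One commit per batch, analogous to acknowledging after each batch.
        consumer.commit(asynchronous=False)
finally:
    consumer.close()
```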
Release Notes for Confluent Kafka Snap Pack, version 423patches7900
We’re pleased to announce an interim version of our Confluent Kafka Snap Pack, 423patches7900, released today, January 11. This update contains a set of enhancements and changes that will be included and documented more fully in our forthcoming February 2021 GA release (4.24). Below is a summary of the changes in this interim release.

- Removed the Confluent prefix from the label of all Snaps and accounts in this Snap Pack. (The pack itself is still named Confluent Kafka.)

- Added a Wait For Full Count checkbox setting to the Kafka Consumer to determine how a positive value for the Message Count setting should be interpreted.
  - Enabled (the default): the Snap continues polling for messages until the specified count is reached.
  - Disabled: if fewer messages are currently available than the specified count, the Snap consumes the available messages and terminates.
  - Known issue: the Wait For Full Count checkbox is activated only when you provide a positive integer value in the Message Count field; it is not activated when you use an expression for Message Count, even if it evaluates to a positive number. Workaround: temporarily replace the Message Count expression with a positive integer, select the desired state for Wait For Full Count, and then restore the original value in the Message Count field. This has been fixed in the 4.24 release.

- Added support for writing and reading record headers. The Kafka Producer Snap has a new Headers table to configure the Key, Value, and Serializer for each header to be written. The Kafka Consumer Snap will read any headers present on the records it consumes, and it provides two new settings to configure how the header values should be deserialized: Default Header Deserializer, and Header Deserializers for any headers that require a deserializer other than the default.

- Added support for writing and reading each record’s timestamp. The Kafka Producer Snap has a new Timestamp setting that can be used to set each record’s timestamp, which is the number of milliseconds since the epoch (00:00:00 UTC on January 1, 1970). It can be set to an expression that evaluates to a long integer, a string that can be parsed as a long integer, or a date. If no expression is specified, or its value is empty, the timestamp is set to the current time. Note that this setting is relevant only if the Kafka topic is configured with message.timestamp.type = CreateTime (the default). The Kafka Consumer Snap has a new checkbox setting, Include Timestamp, which defaults to disabled for backward compatibility; if enabled, the output for each record includes its timestamp in its metadata. (A client-level sketch of record headers and timestamps appears after this post.)

- The Kafka Producer Snap has a new checkbox setting, Output Records, to determine the format of each output document when the Snap is configured with an output view.
  - Disabled (the default): the Snap’s output includes only the basic metadata (topic, partition, offset) for each record, plus the original input document.
  - Enabled: each output document contains a more complete representation of the record produced, including its key, value, headers, and timestamp.

- The Kafka Consumer Snap has a new setting, Output Mode, with two selections:
  - One output document per record (the default): every record received from Kafka has a corresponding output document.
  - One output document per batch: use this selection to preserve the batching of records as received from Kafka. Every poll that returns a non-empty set of records results in a single output document containing this list of records as batch, plus batch_size and batch_index. This mode is especially useful when Auto Commit is disabled and Acknowledge Mode is Wait after each batch of records, depending on the nature of the processing between the Kafka Consumer and the Kafka Acknowledge Snaps. For an in-depth look at this new feature, see this article.

- Removed the Account reference from Kafka Acknowledge, as this Snap does not need an account.

- Removed the Add 1 to Offsets setting from the Kafka Consumer.

Please respond to this post with any questions about this release.

Patrick Taylor
Principal Software Engineer
ptaylor@snaplogic.com
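For reference, the header and timestamp settings described in these release notes correspond to fields that the underlying Kafka clients already expose on each record. A minimal sketch with the confluent-kafka Python package, showing both the producing and the consuming side; the broker, topic, and header names are placeholders:

```python
import time

from confluent_kafka import Consumer, Producer

producer = Producer({"bootstrap.servers": "localhost:9092"})  # placeholder broker

# Write one record with headers and an explicit timestamp (milliseconds since
# the epoch), analogous to the Producer Snap's Headers table and Timestamp setting.
producer.produce(
    "demo-topic",                                             # placeholder topic
    key="order-1",
    value=b'{"amount": 42}',
    headers=[("source", b"orders-db"), ("trace-id", b"abc123")],
    timestamp=int(time.time() * 1000),
)
producer.flush()

consumer = Consumer({
    "bootstrap.servers": "localhost:9092",
    "group.id": "headers-demo",
    "auto.offset.reset": "earliest",
})
consumer.subscribe(["demo-topic"])

msg = consumer.poll(timeout=10.0)
if msg is not None and msg.error() is None:
    # headers() returns a list of (name, bytes) pairs; timestamp() returns a
    # (timestamp_type, milliseconds-since-epoch) pair.
    print(msg.headers(), msg.timestamp())
consumer.close()
```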
Move data from Database to Kafka and then to Amazon S3 Storage
Contributed by @pkona

There are two pipelines in this pattern. The first pipeline extracts data from a database and publishes it to a Kafka topic. The second pipeline consumes from the Kafka topic and ingests the data into Amazon S3 storage by calling a third pipeline.

Publish to Kafka
Source: Oracle
Target: Kafka
Snaps used: Oracle Select, Group By N, Confluent Kafka Producer

Consume from Kafka to S3
Source: Kafka
Target: Pipeline Execute
Snaps used: Confluent Kafka Consumer, Mapper, Pipeline Execute (calling the Write file to S3 pipeline)

Write file to S3
Source: Kafka
Target: Amazon S3 Storage
Snaps used: Mapper, JSON Formatter, File Writer

Downloads
Publish to Kafka.slp (5.4 KB)
Consume from Kafka to S3.slp (5.4 KB)
Write file to S3.slp (4.5 KB)
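As a rough client-level illustration of the consume-and-write portion of this pattern (the second and third pipelines), the logic amounts to reading a group of records from the topic and writing them to S3 as a single object. A sketch with confluent-kafka and boto3; the broker, topic, bucket, and object key scheme are placeholders:

```python
import uuid

import boto3
from confluent_kafka import Consumer

s3 = boto3.client("s3")
consumer = Consumer({
    "bootstrap.servers": "localhost:9092",  # placeholder broker
    "group.id": "s3-sink",                  # placeholder group id
    "enable.auto.commit": False,
    "auto.offset.reset": "earliest",
})
consumer.subscribe(["orders"])              # placeholder topic

while True:
    # Pull up to 1000 records, or whatever arrives within 10 seconds.
    batch = consumer.consume(num_messages=1000, timeout=10.0)
    records = [m.value().decode() for m in batch
               if m.error() is None and m.value() is not None]
    if not records:
        continue
    # Write the batch as one newline-delimited object, then commit the offsets,
    # mirroring the Mapper -> JSON Formatter -> File Writer pipeline above.
    body = "\n".join(records)
    s3.put_object(Bucket="my-bucket", Key=f"kafka/{uuid.uuid4()}.json", Body=body)
    consumer.commit(asynchronous=False)
```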