The Acknowledge is failing because the metadata is present in the input document, but it's not in the default location under the document root ($metadata), because of how the Join combines and restructures the data from its inputs. Try validating the pipeline, then preview the Join's output to note where the full metadata is located within the document. Then open the Acknowledge snap, click the suggest button for the Metadata Path setting, and select the location of the metadata.
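For illustration only (the actual structure depends on your Join's type and settings, so trust the preview rather than this sketch), the Join's output might look something like this, with the Kafka metadata nested under one side of the join instead of at the root:

```json
{
  "original": {
    "metadata": {
      "topic": "orders",
      "partition": 0,
      "offset": 12345
    },
    "value": { "orderId": "A-100" }
  },
  "lookupData": { "customerName": "Acme" }
}
```

In a shape like that (the field names here are made up), the Metadata Path would be $original.metadata rather than the default $metadata, and the suggest button should offer the correct path for your pipeline.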
Also, note the advice in the error message about holding the Shift key when you click the Validate icon. That forces all snaps to run instead of reusing cached preview data from previous validations for snaps you haven't edited, which matters because of the way the Consumer and Acknowledge snaps interact.
As for performance, the bottleneck in your pipeline is that you're inserting records into Snowflake one at a time. You'll get far better performance from data warehouses like Snowflake if you bulk load (insert many records in one operation). Frankly, I'm not really familiar with our Snowflake snaps, but I think Bulk Load or Bulk Upsert are better suited to your use case. Check our documentation for those snaps and, if you still have questions, ask them here in the Community in a new post.
However, right now your Kafka Consumer snap is configured with Acknowledge Mode = Wait after each record, which means the Consumer will output a single document, then wait for the Acknowledge snap to ack that document before it outputs the next record. Obviously that's incompatible with the requirements of a bulk loading snap. (You also have Message Count set to 1, but I'm guessing that was for debugging purposes and you'll set it back to the default, -1.)
Fortunately, the Kafka Consumer snap has a lot of flexibility to deal with such scenarios. At a minimum, you'll need to change Acknowledge Mode to Wait after each batch of records. This lets the Consumer output many records at a time, then wait for all of those records to be acknowledged before asking the Kafka broker for more records to process. In your case, you'll probably also need to change the Output Mode to One output document per batch and then use the Pipeline Execute snap to process each batch in a child pipeline. You would put the Snowflake bulk loading snap in the child pipeline; each execution of the child pipeline would process one batch of records received from Kafka. That will vastly improve your performance.
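To make that concrete, here's a rough sketch of what a single batch-mode output document might look like; the field names are my illustration, not the snap's documented schema, so preview the Consumer's output to see the real structure:

```json
{
  "batch": [
    {
      "metadata": { "topic": "orders", "partition": 0, "offset": 12345 },
      "value": { "orderId": "A-100" }
    },
    {
      "metadata": { "topic": "orders", "partition": 0, "offset": 12346 },
      "value": { "orderId": "A-101" }
    }
  ]
}
```

The parent pipeline would pass each such document to Pipeline Execute; the child pipeline would split the array back into individual records (e.g., with a JSON Splitter) and hand all of them to the Snowflake bulk loading snap in a single operation.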
I wrote an article that walks through this in much more detail; you can find it here:
Hope this helps.