Forum Discussion
Hi @ptaylor,
Appreciate your in-depth response.
I agree that I should use the “Snowflake Bulk Load” snap with the Kafka Consumer's Batch Mode option. I usually validate the pipeline and then select the $variable, which ensures correctness, but the pipeline did not work last time; after I re-validated, it started working.
I then replaced the Snowflake Insert with the Snowflake Bulk Insert, but it started to time out again.
I thought the Bulk Insert might work with the Kafka Consumer's Batch acknowledge property, but it does not; it times out.
I think I am missing something. Could you please look at the properties and tell me what I am missing here, and please attach your pipeline as well if you try it out?
Attaching all 3 pipelines; I added the extra Mappers etc. for debugging purposes only.
Bulk insert with Batch not Working Pipeline_2022_02_22.slp (17.3 KB)
Bulk insert not Working Pipeline_2022_02_22.slp (12.1 KB)
Working Pipiline_2022_02_22.slp (11.4 KB)
- neeraj_sharma · 4 years ago · New Contributor II
Yes, with the Pipeline Execute snap, it started to work and performs very well.
Just one follow-up question: earlier we wanted to pull data from different Kafka topics, join it all together in one pipeline, and insert it into one table. Now it looks like that's not possible. Do we now need to create separate pipelines for each topic, push the data into separate tables, and then create one more pipeline to join all the data and load it into the final table?
- ptaylor · 4 years ago · Employee
Ok, I’m glad that you were able to get it working well with a single topic.
Performing joins on streaming data in real time is a very advanced subject. To discuss it in detail would require much more information about your use cases and is not a discussion I can really get into here in this forum.

I would consider whether it might make sense to read the data into separate Snowflake tables and then use Snowflake to do the joins. If you need true streaming functionality like windowed joins then you might look at KsqlDB or Kafka Streams. It might be possible to do the joins in SnapLogic pipelines but that can get very tricky with real-time streams that don’t end, as our Join is designed for finite input streams.

One thing to consider is a hybrid approach where you use KsqlDB to do the joins of the separate Kafka topics, which will produce a new topic containing the joined data. Then use our Kafka Consumer snap to read that topic and insert into Snowflake.