Data reconciliation solutions?

feenst
New Contributor III

One of my company's use cases for SnapLogic today is replication of data from Salesforce into internal Kafka topics for use throughout the enterprise.

There have been various instances of internal consumers of the Kafka data reporting missing records.

Investigations have found multiple causes for these data drops. Some of the causes are related to behavior that Salesforce describes as "Working As Designed".

Salesforce has recommended other replication architectures, but there are various concerns within my company about using them (license cost, platform load) ... and we might still end up with missing data.

So, we're looking into data reconciliation / auditing solutions. Are there any recommendations on a tool that can:

* Identify Salesforce record(s) that do not have a matching record (e.g. same timestamp) in Kafka

* Generate a message containing relevant metadata (e.g. record Id, Salesforce object, Kafka topic) to be sent to a REST endpoint / message queue for reprocessing
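
To make the two requirements above concrete, here is a minimal sketch of the comparison step. It assumes the Salesforce side has already been queried into dicts carrying `Id` and `SystemModstamp`, and that the Kafka side has been reduced to a map of record Id to the latest timestamp seen on the topic. The function name `find_missing`, the message field names, and the `reason` values are all hypothetical, and timestamps are assumed to be ISO 8601 UTC strings in one identical format so that plain string comparison orders them correctly.

```python
from typing import Dict, Iterable, List


def find_missing(
    sf_records: Iterable[Dict[str, str]],
    kafka_latest: Dict[str, str],
    sobject: str,
    topic: str,
) -> List[Dict[str, str]]:
    """Return one reprocessing message per record that Kafka lacks or holds stale.

    sf_records:   dicts with "Id" and "SystemModstamp" (assumed already queried).
    kafka_latest: record Id -> latest timestamp observed on the Kafka topic.
    """
    messages = []
    for rec in sf_records:
        record_id = rec["Id"]
        sf_ts = rec["SystemModstamp"]
        kafka_ts = kafka_latest.get(record_id)

        if kafka_ts is None:
            reason = "absent"   # record never reached the topic
        elif kafka_ts < sf_ts:
            reason = "stale"    # topic only holds an older version
        else:
            continue            # Kafka is up to date for this record

        # Metadata a downstream reprocessor would need (field names are illustrative).
        messages.append(
            {
                "recordId": record_id,
                "sobject": sobject,
                "topic": topic,
                "salesforceTimestamp": sf_ts,
                "reason": reason,
            }
        )
    return messages
```

Each returned dict could then be serialized with `json.dumps` and posted to the REST endpoint or message queue that drives reprocessing.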

2 REPLIES

Scott
Admin

Salesforce Subscriber Snap & Change Data Capture (CDC) API

Our Salesforce Subscriber Snap is designed to work with Salesforce’s Change Data Capture (CDC) streaming API, which was their primary API for tracking real-time data changes for many years. However, this API has limitations in usability, reliability, and scalability, making it difficult to implement robust pipelines that ensure no events are missed. Given these constraints, reliably capturing and processing all changes can be challenging.

In recent years, Salesforce introduced a new API to address these challenges, offering better scalability and reliability. However, because it is a completely different API, we have not yet developed a new version of the Salesforce Subscriber Snap to support it. While this has been discussed internally, there are currently no immediate plans to build an updated Snap for the new API.

Handling Missed Events in Kafka

For customers using Kafka to process Salesforce data changes, it’s important to note that Kafka is not a traditional database where records can be looked up by key. Instead, Kafka operates as an append-only log of events, meaning you consume events in the order they arrive rather than retrieving a specific record at will.

The Salesforce CDC API events contain record identifiers, but they are intended to be processed in real time. If an event is missed, there is a limited window to retrieve it, and there is no built-in mechanism to look up past events by key within Kafka. Because of this, a common approach is to consume the events as they arrive and store them in a target database (e.g., Snowflake) to maintain an up-to-date view of customer records.
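
As a rough illustration of that approach, the sketch below folds an ordered stream of events into a keyed table so that the latest state of each record can be looked up by Id. It uses an in-memory SQLite database purely as a stand-in for the target database; the table name `latest`, the event field names, and the helper names `materialize` and `lookup` are assumptions made for the example, and the events are assumed to have been consumed from the topic already, in offset order.

```python
import sqlite3
from typing import Dict, Iterable, Optional, Tuple


def materialize(events: Iterable[Dict[str, str]], conn: sqlite3.Connection) -> None:
    """Apply events in arrival order; the last event for a record Id wins."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS latest ("
        "record_id TEXT PRIMARY KEY, modstamp TEXT, payload TEXT)"
    )
    for event in events:
        # INSERT OR REPLACE overwrites any earlier row with the same primary key.
        conn.execute(
            "INSERT OR REPLACE INTO latest (record_id, modstamp, payload) "
            "VALUES (?, ?, ?)",
            (event["record_id"], event["modstamp"], event["payload"]),
        )
    conn.commit()


def lookup(conn: sqlite3.Connection, record_id: str) -> Optional[Tuple[str, str]]:
    """Return (modstamp, payload) for a record Id, or None if never seen."""
    return conn.execute(
        "SELECT modstamp, payload FROM latest WHERE record_id = ?",
        (record_id,),
    ).fetchone()
```

Once the events are materialized this way, a key lookup that Kafka itself does not offer becomes an ordinary query against the table.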

feenst
New Contributor III

Hi Scott,

Thanks for the response.

My organization is not using the CDC API today. We are using the REST and Bulk APIs to replicate data. Salesforce support and solution architects have recommended CDC instead of our current architecture.

Do you have more information about the usability, reliability, and scalability limitations of CDC?

Also, can you advise on what the new API is that Salesforce has introduced?

Regarding handling the missed events in Kafka, my organization uses Kafka to allow multiple internal applications to consume updates from Salesforce for objects of interest.

My team is responsible for the replication (today using SnapLogic) as well as the Kafka platform. We are interested in validating that all of the expected records from Salesforce have successfully reached Kafka.

If a record has not reached Kafka, we are interested in triggering an automated process to retrieve the missing record(s) and ensure that all of the latest, expected records have reached Kafka.

So, I am seeking input from the community on any tools that might support this use case. If the tool requires data to be extracted from Kafka into a traditional database for the reconciliation / audit with Salesforce, this would be fine.