Forum Discussion
@bojanvelevski: I understand that we closed this thread once we got the desired solution, but I’m adding a comment here because the next thing I’m trying to achieve is related to the same functionality.
We were selecting all records whose Last_Updated was within the last 48 hours, and those records were then used to update the Salesforce table.
After that, a scheduled job was enabled to run 4 times a day. Because each run considers records updated within the last 48 hours, we are seeing duplicate records in each batch, and I believe we will keep seeing them as long as the criterion is “less than 48 hours”. Is there a way to filter out what was already updated in the previous batch and consider ONLY the brand-new records?
Thanking you in advance for your help 🙂
Regards,
Darsh
- bojanvelevski (Valued Contributor), 4 years ago
To exclude duplicates, you’ll have to use a Diff mechanism. In practice, when you get the records that were updated in the last 48 hours, you need to compare them with the Salesforce records added in the last 48 hours by using a Diff snap.
- darshthakkar (Valued Contributor), 4 years ago
Thank you @bojanvelevski, I checked the Diff snap; I wasn’t aware of it until now. It suggests that the input streams should be sorted, so I will have to use a sorter, as my entire pipeline was unsorted.
- bojanvelevski (Valued Contributor), 4 years ago
Yes @darshthakkar, you’ll have to sort both input streams on the same field you’ll be diffing on.
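For readers who want to see why both streams must be sorted on the comparison field, here is a minimal Python sketch of a sorted merge-diff. It is only an illustration of the general idea, not the Diff snap's actual implementation. The function name `diff_sorted`, the `Id` key field, and the sample records are all hypothetical, and the sketch assumes the key is unique within each stream.

```python
from operator import itemgetter


def diff_sorted(source_rows, target_rows, key):
    """Compare two lists of dicts that are BOTH sorted ascending by `key`.

    Returns (only_in_source, in_both). Assumes `key` is unique per list.
    """
    only_in_source, in_both = [], []
    i = j = 0
    while i < len(source_rows) and j < len(target_rows):
        s, t = source_rows[i][key], target_rows[j][key]
        if s == t:
            # Same key on both sides: already present in the target.
            in_both.append(source_rows[i])
            i += 1
            j += 1
        elif s < t:
            # Sorted order guarantees the target has no match for s.
            only_in_source.append(source_rows[i])
            i += 1
        else:
            # Target row with no source counterpart: skip it.
            j += 1
    # Whatever remains in the source has no counterpart in the target.
    only_in_source.extend(source_rows[i:])
    return only_in_source, in_both


# Hypothetical data: records updated in the last 48 hours (unsorted) ...
recent = [{"Id": "C"}, {"Id": "A"}, {"Id": "B"}]
# ... and records already written to Salesforce in that window.
already_synced = [{"Id": "B"}, {"Id": "A"}]

# Sort both streams on the same field before diffing.
recent_sorted = sorted(recent, key=itemgetter("Id"))
synced_sorted = sorted(already_synced, key=itemgetter("Id"))

new_rows, already_sent = diff_sorted(recent_sorted, synced_sorted, "Id")
print([r["Id"] for r in new_rows])
```

Because the comparison walks both lists in a single pass, it only works when the two inputs share the same sort order on the same field; with unsorted input, matching records would be reported as new.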