Change Data Capture from Workday
A common requirement from Workday customers is to extract data within a time range: for example, data that changed in the last 7 days, the last 2 days, or just the last day. Classic examples are:

- Terminations within the last 90 or 30 days
- New hires starting in the next 7 days who were entered within the last 7 days
- All active workers, and only terminations, within the last pay period or 15 days

This is straightforward to accomplish. Attached is a sample pipeline that extracts workers whose preferred name changed since July 2016. You can, of course, parameterize these date ranges and run the pipeline as a batch job every night or morning.

The most important thing to understand is that Workday provides a Transaction Log that tracks all transactional changes within Workday. Workday records every change as part of a transaction, and each transaction is identified by a Transaction Type.

Resources for easy development of these integrations:

- WKD_TRANSACTION_LOG.SLP - a simple pipeline that returns changes in preferred names since July 2016
- Workday Community documentation: Workday Resource Center - Sign In

Below is the mapper that tells the Workday Read Snap which changes to extract. Attached is the sample pipeline.

WKD_Transaction_log.slp (6.6 KB)
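Outside of SnapLogic, the same date-range request can be sketched by building the Get_Workers request criteria directly. This is a minimal illustration only: the element names (`Transaction_Log_Criteria_Data`, `Transaction_Date_Range_Data`, `Updated_From`, `Updated_Through`) follow my reading of the Workday Human_Resources web-service schema and should be verified against the Workday API documentation for your tenant version.

```python
from datetime import datetime, timedelta, timezone

def build_get_workers_criteria(updated_from, updated_through):
    """Build the Request_Criteria portion of a Workday Get_Workers call
    that restricts results to workers changed within a date range.
    Element names are illustrative; check them against your WWS version."""
    return {
        "Request_Criteria": {
            "Transaction_Log_Criteria_Data": {
                "Transaction_Date_Range_Data": {
                    "Updated_From": updated_from.isoformat(),
                    "Updated_Through": updated_through.isoformat(),
                }
            }
        }
    }

# Parameterized window: everything changed in the last 7 days.
now = datetime(2016, 7, 15, tzinfo=timezone.utc)
criteria = build_get_workers_criteria(now - timedelta(days=7), now)
```

In a nightly batch job, `updated_from` would be the previous run's timestamp rather than a fixed offset.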
Change Data Capture With SnapLogic

I frequently get asked by customers: does SnapLogic support Change Data Capture (CDC)? My response is usually: what does CDC mean to you? What are you looking to do? While a few organizations want CDC in the traditional sense (log scraping and the like), most of these customers simply want to optimize data movement by moving only the records that have changed since the last run. In the context of Salesforce, Workday, and other SaaS applications, traditional log-scraping CDC makes limited sense, which is why SnapLogic's approach, outlined below, is especially relevant.

Our approach is query-based CDC, which assumes the source data has a 'last updated' timestamp field we can compare against. Below is an example of query-based CDC implemented in a SnapLogic pipeline that extracts Account objects from Salesforce that have changed since the last run and loads them into SQL Server. A second pipeline implements the reverse movement, from SQL Server to Salesforce.

Part 1: Synchronize Salesforce to SQL Server

This pipeline maintains the last-read timestamp in a file on SLDB. The flow starts by reading the timestamp along the top path of the pipeline and running a few sanity checks on it (data type, null value, etc.). It then makes a copy of the timestamp and passes it into the Salesforce Account read, to be carried downstream with the results of the read. Along the top path, the pipeline captures the current timestamp, formats it, and writes it back to the timestamp file as the new most-recent CDC time. Along the bottom path, the saved timestamp is compared with the last_modified field on the Salesforce Account object, which records the last time a given record changed. If the last_modified value is greater than our saved timestamp, we keep the record; otherwise we discard it.
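The timestamp logic the pipeline performs (bottom-path filter, top-path watermark update) can be sketched in plain Python. This is an illustration of the technique, not the pipeline itself; the record shape and the `last_modified` field name are assumptions borrowed from the description above.

```python
from datetime import datetime, timezone

ISO = "%Y-%m-%dT%H:%M:%S%z"

def changed_since(records, last_run, field="last_modified"):
    """Bottom path: keep only records whose timestamp field is newer
    than the saved CDC timestamp."""
    return [r for r in records if datetime.strptime(r[field], ISO) > last_run]

def next_watermark(now=None):
    """Top path: format the current time for writing back to the
    timestamp file as the new most-recent CDC time."""
    return (now or datetime.now(timezone.utc)).strftime(ISO)

last_run = datetime.strptime("2016-07-15T00:00:00+0000", ISO)
accounts = [
    {"Id": "001A", "last_modified": "2016-07-20T08:30:00+0000"},  # changed: keep
    {"Id": "001B", "last_modified": "2016-07-01T12:00:00+0000"},  # unchanged: drop
]
print([a["Id"] for a in changed_since(accounts, last_run)])  # ['001A']
```

The sanity checks the pipeline runs on the stored timestamp (data type, null value) would correspond here to validating the file contents before calling `strptime`.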
At the output of the filter, we now have only the records that changed since the last time we moved data from Salesforce to SQL Server.

Part 2: Synchronize SQL Server to Salesforce

The reverse pipeline, which implements timestamp-based CDC from SQL Server to Salesforce, is shown below. The logic is identical to Part 1. These pipelines can be run in a periodic polling mode or triggered by an event in the source system.

Pipelines explained in this topic:

- Sync SS-SF w Timestamp_2016_07_15 (1).slp (16.0 KB)
- Sync SF-SS w Timestamp_2016_07_15.slp (15.8 KB)
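When the comparison is pushed down into the source query rather than done in a filter, the Salesforce side of this pattern becomes a SOQL WHERE clause on a system timestamp. A minimal sketch, assuming the standard `LastModifiedDate` field and SOQL's unquoted UTC datetime literal format:

```python
from datetime import datetime, timezone

def cdc_soql(last_run, sobject="Account", fields=("Id", "Name")):
    """Render a query-based CDC SOQL statement. SOQL datetime literals
    are unquoted, in UTC, in YYYY-MM-DDThh:mm:ssZ form."""
    ts = last_run.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    return (f"SELECT {', '.join(fields)} FROM {sobject} "
            f"WHERE LastModifiedDate > {ts}")

q = cdc_soql(datetime(2016, 7, 15, tzinfo=timezone.utc))
# SELECT Id, Name FROM Account WHERE LastModifiedDate > 2016-07-15T00:00:00Z
```

In periodic polling mode, the job would regenerate this query on each run with the watermark saved by the previous run.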