Snaplex-based scheduler for scheduled tasks - how to know when snaplex node is synchronized with control plane?

We have a two-node Groundplex running on Linux (RHEL). Every month we patch the OS on these nodes, and we have written the following scripts to facilitate the patching process.

Script to enter maintenance mode:

  1. Use Snaplex Management API to put node into maintenance mode.
  2. Call Snaplex Monitoring API to get the status of the node. Keep calling the API until the node status is not “running”.
  3. Call Pipeline Monitoring API to get the list of “running” pipelines. Keep calling the API until there are no running pipelines.
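As a rough illustration, the three steps above could be sketched as follows. The callables `set_maintenance`, `get_node_status`, and `list_running_pipelines` are hypothetical stand-ins for our wrappers around the Snaplex Management, Snaplex Monitoring, and Pipeline Monitoring APIs (the real endpoints and payloads are not shown here), and the pipeline list is assumed to be filtered to the node in question.

```python
import time
from typing import Callable, List


def poll_until(check: Callable[[], bool],
               interval_s: float = 10.0,
               timeout_s: float = 1800.0,
               sleep: Callable[[float], None] = time.sleep) -> None:
    """Call check() repeatedly until it returns True; raise on timeout."""
    waited = 0.0
    while not check():
        if waited >= timeout_s:
            raise TimeoutError("condition not met within timeout")
        sleep(interval_s)
        waited += interval_s


def enter_maintenance(node: str,
                      set_maintenance: Callable[[str, bool], None],
                      get_node_status: Callable[[str], str],
                      list_running_pipelines: Callable[[str], List[str]],
                      **poll_kwargs) -> None:
    """Hypothetical wrapper; the three callables hide the real API calls."""
    # Step 1: ask for the node to be put into maintenance mode.
    set_maintenance(node, True)
    # Step 2: wait until the node no longer reports "running".
    poll_until(lambda: get_node_status(node) != "running", **poll_kwargs)
    # Step 3: wait until no pipelines are still running.
    poll_until(lambda: not list_running_pipelines(node), **poll_kwargs)
```

The timeout is an addition for the sketch: without it, a node that never drains would block the patching run indefinitely.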

Script that we use when exiting maintenance mode:

  1. Call Snaplex Monitoring API to get the status of the node. Keep calling the API until the node status is “down” or “running”. (We do this to ensure that the node has restarted after the reboot.)
  2. Use Snaplex Management API to take the node out of maintenance mode.
  3. Call Snaplex Monitoring API to get the status of the node. Keep calling the API until the node status is “running”.
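The exit script follows the same polling pattern. A minimal sketch, again assuming hypothetical `set_maintenance` and `get_node_status` wrappers around the Snaplex Management and Monitoring APIs, and assuming the status strings are exactly "down" and "running":

```python
import time
from typing import Callable, Set


def wait_for_status(node: str,
                    get_node_status: Callable[[str], str],
                    accepted: Set[str],
                    interval_s: float = 10.0,
                    max_polls: int = 180,
                    sleep: Callable[[float], None] = time.sleep) -> str:
    """Poll the node status until it is one of the accepted values."""
    for _ in range(max_polls):
        status = get_node_status(node)
        if status in accepted:
            return status
        sleep(interval_s)
    raise TimeoutError(f"{node} never reached one of {sorted(accepted)}")


def exit_maintenance(node: str,
                     set_maintenance: Callable[[str, bool], None],
                     get_node_status: Callable[[str], str],
                     **poll_kwargs) -> None:
    """Hypothetical wrapper; the callables hide the real API calls."""
    # Step 1: "down" or "running" means the node is back after the reboot.
    wait_for_status(node, get_node_status, {"down", "running"}, **poll_kwargs)
    # Step 2: take the node out of maintenance mode.
    set_maintenance(node, False)
    # Step 3: wait until the node reports "running" again.
    wait_for_status(node, get_node_status, {"running"}, **poll_kwargs)
```

Note that this only confirms that the node reports "running"; it says nothing about whether the node has received its scheduling information.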

The overall patching process is executed by the Linux admin and goes like this:

  1. Invoke script to put node1 into maintenance mode.
  2. Apply OS patch to node1 and reboot node1.
  3. SnapLogic automatically starts on node1 after the reboot because we have it configured to do so using the init.d utility.
  4. Invoke script to take node1 out of maintenance mode.
  5. As soon as node1 comes up, invoke the script to put node2 into maintenance mode.
  6. Apply OS patch to node2 and reboot node2.
  7. SnapLogic automatically starts on node2 after the reboot because we have it configured to do so using the init.d utility.
  8. Invoke script to take node2 out of maintenance mode.

When we executed this patching process, node1 was rebooted and its JCC restarted at 5:56:53 PM on 8/1. node2 was rebooted and its JCC restarted at 6:01:03 PM on 8/1.

We have three test tasks that are scheduled to run every fifteen minutes.

We observed the following behavior:

test_scheduler_1 task - executed at 5:45PM as expected, but didn’t execute again until the 6:30 scheduled run, which actually started at 6:44. Two scheduled events were missed (6:00 and 6:15).

test_scheduler_2 task - executed at 5:45PM as expected, but didn’t execute again until the 7:30 scheduled run, which actually started at 7:37. Six scheduled events were missed (6:00, 6:15, 6:30, 6:45, 7:00, and 7:15).

test_scheduler_3 task - executed at 5:45PM as expected, but didn’t execute again until the 7:00 scheduled run, which actually started at 7:01. Four scheduled events were missed (6:00, 6:15, 6:30, and 6:45).
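The missed-event counts above can be cross-checked with a small calculation. This is only an illustrative sketch: `missed_slots` is a helper written for this post, and the times are the observed ones converted to 24-hour notation.

```python
from typing import List


def to_minutes(hhmm: str) -> int:
    """Convert "HH:MM" (24-hour) to minutes since midnight."""
    hours, minutes = hhmm.split(":")
    return int(hours) * 60 + int(minutes)


def to_hhmm(total_minutes: int) -> str:
    """Convert minutes since midnight back to "HH:MM"."""
    return f"{total_minutes // 60:02d}:{total_minutes % 60:02d}"


def missed_slots(last_run: str, resumed_slot: str, every_min: int = 15) -> List[str]:
    """List the scheduled slots strictly between the last run and the resumed slot."""
    start = to_minutes(last_run) + every_min
    end = to_minutes(resumed_slot)
    return [to_hhmm(t) for t in range(start, end, every_min)]


# (last successful run, scheduled slot at which the task resumed)
observations = {
    "test_scheduler_1": ("17:45", "18:30"),
    "test_scheduler_2": ("17:45", "19:30"),
    "test_scheduler_3": ("17:45", "19:00"),
}

for task, (last_run, resumed) in observations.items():
    slots = missed_slots(last_run, resumed)
    print(task, len(slots), slots)
```

This yields two, six, and four missed slots respectively, matching the counts listed above.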

Even though at least one node in the Snaplex was up and running throughout the patching process, both nodes were rebooted, and it appears that scheduled events were missed because the scheduling information had not yet been pushed from the control plane to the Snaplex nodes. If this is the case, is there an API that we can use to determine the state of the scheduling information on a Snaplex node? We’d like to modify our patching scripts so that we don’t start patching node2 until the JCC on node1 is up AND the scheduling information on that node is synchronized with the control plane. Can you suggest a way to do that?
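To make the request concrete, here is a sketch of the gating we have in mind. All five callables are hypothetical; in particular, `schedules_synced` is a placeholder for whatever check (if any exists) can confirm that a node's scheduling information matches the control plane. We do not know of such an API, which is the point of the question.

```python
from typing import Callable, Iterable, List


def rolling_patch(nodes: Iterable[str],
                  enter_maintenance: Callable[[str], None],
                  patch_and_reboot: Callable[[str], None],
                  exit_maintenance: Callable[[str], None],
                  schedules_synced: Callable[[str], bool]) -> List[str]:
    """Patch nodes one at a time; stop if a node is not confirmed ready."""
    done: List[str] = []
    for node in nodes:
        enter_maintenance(node)
        patch_and_reboot(node)
        exit_maintenance(node)  # assumed to return once the node is "running"
        # Hypothetical gate: do not touch the next node until this one is
        # confirmed to hold current scheduling information.
        if not schedules_synced(node):
            raise RuntimeError(
                f"{node}: schedules not confirmed synced; "
                "not patching the next node"
            )
        done.append(node)
    return done
```

With a gate like this, node2 would stay in service and keep running the scheduled tasks until node1 is confirmed ready to take over.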

Do you have the Snaplex-based scheduler enabled? That kind of delay to schedules wouldn’t make sense if it were enabled.

Hi @cstewart. Yes, we have the Snaplex-based scheduler enabled.