01-02-2018 08:20 AM
One of our SL Nodes freezes on high memory and won’t start any new pipelines or accept any webhooks. The only way we’ve found to fix this is a manual reboot.
Memory usually fluctuates between 70% and 85%, and CPU between 10% and 60%, and everything works fine. Then suddenly the memory percentage stops changing and CPU falls to 2-3%. The node stays in this state until we manually reboot it. We aren’t receiving any alerts for this either.
Is anyone having the same problem? Any ideas on how to fix this issue?
It would be great if SL would catch this issue and automatically restart the node.
01-02-2018 11:07 AM
I’d recommend that you contact Support.
01-03-2018 01:57 PM
Hi,
This is a known issue: a memory leak that we have been experiencing for the last couple of months and that SnapLogic Support is investigating. I’d reiterate the suggestion of raising a support ticket, as any additional information from other orgs will help with the diagnosis.
We are currently restarting Groundplex nodes manually approximately every 2 weeks and have considered automating the restarts. However, we consider this a workaround and are looking for a fix to the root cause.
We’ve also scripted a notification for when Groundplex nodes pass a certain memory threshold for a sustained period, as these notifications weren’t available out of the box.
Cheers,
C.J.
01-03-2018 02:29 PM
Thanks C.J.
We have submitted a support ticket and are currently working with SL support. We’re now waiting for the next freeze so we can get them some better info.
You mentioned you have scripted a notification? Would you mind explaining how you did this? The “…for a sustained period” seems like the important part there. I guess we could also set up an alert if CPU usage doesn’t go over 10% for more than 1 hour, or something similar.
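To make the CPU idea concrete, here is a rough sketch of what such a check might look like. It assumes we already have a list of (timestamp, CPU %) readings for a node from some monitoring source; the function name `cpu_stalled` and the defaults (10% and 1 hour) are just placeholders taken from the numbers above, not anything SL provides.

```python
from typing import Iterable, Tuple

# Hypothetical sample shape: (unix_timestamp_seconds, cpu_percent)
Sample = Tuple[float, float]


def cpu_stalled(samples: Iterable[Sample],
                max_cpu: float = 10.0,
                window_seconds: float = 3600.0) -> bool:
    """Return True if CPU stayed below max_cpu for the whole trailing window."""
    ordered = sorted(samples)
    if not ordered:
        return False
    newest = ordered[-1][0]
    # Without at least one full window of history we cannot claim a stall.
    if newest - ordered[0][0] < window_seconds:
        return False
    # Keep only readings that fall inside the trailing window.
    recent = [cpu for ts, cpu in ordered if ts >= newest - window_seconds]
    return all(cpu < max_cpu for cpu in recent)
```

Requiring a full window of history avoids a false alarm right after a node restarts, when only a few low readings exist.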
01-10-2018 04:26 PM
Hi,
Our notification script is just a simple Python script running as a cron job on one of our servers. It hits the SnapLogic Public API to gather node CPU and memory usage information. You could probably implement something similar as a SnapLogic Pipeline running on the Cloudplex if you wanted to.
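For illustration only, the "sustained period" part of a cron-driven check could be sketched roughly as below. This is not our actual script: the function name `record_and_check`, the state file, and the defaults (85% for 30 minutes) are all hypothetical, and the API call that produces `memory_percent` is deliberately left out. Because each cron run starts fresh, the sketch keeps a small JSON file recording when each node first went over the threshold.

```python
import json
from pathlib import Path


def record_and_check(state_path: str,
                     node: str,
                     memory_percent: float,
                     now: float,
                     threshold: float = 85.0,
                     sustain_seconds: float = 1800.0) -> bool:
    """Return True once a node has stayed at or above threshold long enough.

    memory_percent is assumed to come from whatever API call the script makes;
    now is a Unix timestamp in seconds (e.g. time.time()).
    """
    path = Path(state_path)
    # State maps node name -> timestamp when it first crossed the threshold.
    state = json.loads(path.read_text()) if path.exists() else {}
    if memory_percent < threshold:
        # Back under the threshold: forget any earlier breach.
        state.pop(node, None)
        alert = False
    else:
        first_seen = state.setdefault(node, now)
        alert = now - first_seen >= sustain_seconds
    path.write_text(json.dumps(state))
    return alert
```

Each cron run would call this once per node and send a notification whenever it returns True.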
Cheers,
C.J.