Snaplex Node Freezing with High Memory

nsmith
New Contributor III

One of our SL Nodes freezes on high memory and won’t start any new pipelines or accept any webhooks. The only way we’ve found to fix this is a manual reboot.

Memory will be fluctuating between 70-85%, and CPU between 10-60%… everything will be working fine… then suddenly the memory % stops changing, and CPU falls to 2-3%. The node stays in this state until we manually reboot it. We aren’t receiving any alerts for this either.

Is anyone having the same problem? Any ideas on how to fix this issue?

It would be great if SL would catch this issue and automatically restart the node.

8 REPLIES

dmiller
Admin

I’d recommend that you contact Support.


Diane Miller
Community Manager

cj_ruggles
New Contributor III

Hi,

This is a known issue: a memory leak that we have been experiencing for the last couple of months and that SnapLogic Support is investigating. I’d reiterate the suggestion of raising a support ticket, as any additional information from other orgs will help with the diagnosis.

We are currently restarting Groundplex nodes manually approx. every 2 weeks, and have considered automating the restarts. However, we consider this a workaround and are looking for a fix to the root cause.

We’ve also scripted a notification for when Groundplex nodes pass a certain memory threshold for a sustained period, as these notifications weren’t available out of the box.

Cheers,
C.J.

nsmith
New Contributor III

Thanks C.J.

We have submitted a support ticket and are currently working with SL support. We’re now waiting for the next memory leak so we can get some better info to them.

You mentioned you have scripted a notification? Would you mind explaining how you did this? The “…for a sustained period” seems like the important part there. I guess we could also set up an alert if CPU usage doesn’t go over 10% for more than 1 hour or something similar.

cj_ruggles
New Contributor III

Hi,

Our notification script is just a simple Python script running as a cron job on one of our servers - it hits the SnapLogic Public API to gather node CPU & memory usage information. You could probably implement something similar as a SnapLogic Pipeline running on the Cloudplex if you wanted to.

Cheers,
C.J.
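
For anyone wanting to try the "sustained period" idea discussed above, here is a minimal sketch of the check itself. It is not C.J.'s actual script. The class name `SustainedCondition`, the function `check_node`, and the thresholds (memory at or above 85% for 15 minutes, CPU below 10% for 1 hour) are all assumptions chosen for illustration. Fetching the CPU and memory figures from the SnapLogic Public API is deliberately left out, so the caller is assumed to pass them in as percentages.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class SustainedCondition:
    """Reports True once `predicate` has held on every sample for `duration_s` seconds.

    Hypothetical helper for illustration; not part of any SnapLogic tooling.
    """
    predicate: Callable[[float], bool]
    duration_s: float
    since: Optional[float] = None  # timestamp of the first sample in the current streak

    def update(self, value: float, now: float) -> bool:
        if not self.predicate(value):
            # Condition broken: forget the streak and start over.
            self.since = None
            return False
        if self.since is None:
            self.since = now
        return now - self.since >= self.duration_s


# Assumed thresholds; tune them for your own nodes.
high_memory = SustainedCondition(lambda pct: pct >= 85.0, duration_s=15 * 60)
idle_cpu = SustainedCondition(lambda pct: pct < 10.0, duration_s=60 * 60)


def check_node(memory_pct: float, cpu_pct: float, now: float) -> List[str]:
    """Return alert messages for one sample of a node's metrics."""
    alerts = []
    if high_memory.update(memory_pct, now):
        alerts.append("memory at or above 85% for 15 minutes")
    if idle_cpu.update(cpu_pct, now):
        alerts.append("CPU below 10% for 1 hour (node may be frozen)")
    return alerts
```

Call `check_node(memory_pct, cpu_pct, time.time())` on each poll and send a notification whenever the returned list is non-empty. Note that the streak start (`since`) lives in memory, so a cron job that starts a fresh process on every run would need to save and reload it (for example in a small file), or the check could run as a long-lived loop instead.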