Snaplex Node Freezing with High Memory

One of our SL Nodes freezes on high memory and won’t start any new pipelines or accept any webhooks. The only way we’ve found to fix this is a manual reboot.

Memory will be fluctuating between 70-85%, and CPU between 10-60%… everything will be working fine… then suddenly the memory % stops changing, and CPU falls to 2-3%. It stays in this state until we manually reboot. We aren’t receiving any alerts for this either.

Is anyone having the same problem? Any ideas on how to fix this issue?

It would be great if SL would catch this issue and automatically restart the node.

I’d recommend that you contact Support.

Hi,

This is a known issue, a memory leak, which we have been experiencing for the last couple of months and which SnapLogic Support is investigating. I’d reiterate the suggestion of raising a support ticket, as any additional information from other orgs will help with the diagnosis.

We are currently restarting groundplex nodes manually approximately every 2 weeks and have considered automating the restarts. However, we consider this a workaround and are looking for a fix to the root cause.

We’ve also scripted a notification when groundplex nodes pass a certain memory threshold for a sustained period, as these notifications weren’t available OOTB.

Cheers,
C.J.

Thanks C.J.

We have submitted a support ticket and are currently working with SL support. We’re now waiting for the next memory leak so we can get some better info to them.

You mentioned you have scripted a notification? Would you mind explaining how you did this? The “…for a sustained period” seems like the important part there. I guess we could also set up an alert if CPU usage doesn’t go over 10% for more than 1 hour or something similar.
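The CPU-based alert idea could be sketched roughly as below. This is only an illustration of the check itself: the `Sample` type, the `looks_frozen` name, and the `mem_tolerance` parameter are hypothetical, and how the readings are collected is left out. It assumes the samples are sorted oldest first.

```python
from typing import NamedTuple, Sequence


class Sample(NamedTuple):
    """One reading for a node (hypothetical shape, not a SnapLogic type)."""
    ts: float       # Unix timestamp in seconds
    cpu_pct: float  # CPU usage, 0-100
    mem_pct: float  # memory usage, 0-100


def looks_frozen(
    samples: Sequence[Sample],
    window_s: float = 3600.0,    # "more than 1 hour" from the post above
    cpu_ceiling: float = 10.0,   # "doesn't go over 10%"
    mem_tolerance: float = 0.5,  # assumed: how flat memory must be, in points
) -> bool:
    """Return True if CPU stayed low and memory stayed flat for the whole window."""
    if not samples:
        return False
    latest = samples[-1].ts
    # Without at least window_s of history we cannot say anything yet.
    if latest - samples[0].ts < window_s:
        return False
    recent = [s for s in samples if s.ts >= latest - window_s]
    cpu_idle = all(s.cpu_pct < cpu_ceiling for s in recent)
    mem_values = [s.mem_pct for s in recent]
    mem_flat = max(mem_values) - min(mem_values) <= mem_tolerance
    return cpu_idle and mem_flat
```

Combining the two conditions (idle CPU and unchanging memory) matches the symptom described in the first post and should avoid flagging a node that is simply quiet while its memory usage still moves.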

We had a similar issue, and the node used to restart because it would crash (it ran out of memory). I have a ticket open for this issue.

I would be glad if SnapLogic were able to free up memory on its own. (I tried to do it with a Jython script, which didn’t work either.)

Hi,

Our notification script is just a simple Python script running as a cron job on one of our servers - it hits the SnapLogic Public API to gather node CPU & memory usage information. You could probably implement something similar as a SnapLogic Pipeline running on the Cloudplex if you wanted to.

Cheers,
C.J.
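For readers wanting to try something similar, here is a minimal sketch of the "sustained period" part only. It is not C.J.'s actual script: the function names and the default threshold and duration are assumptions, and the call that gathers readings from the API is deliberately omitted. Readings are assumed to be `(timestamp_seconds, mem_pct)` tuples sorted oldest first.

```python
from typing import Sequence, Tuple

Reading = Tuple[float, float]  # (Unix timestamp in seconds, memory usage in %)


def breach_duration(samples: Sequence[Reading], threshold_pct: float) -> float:
    """Seconds covered by the latest unbroken run of readings at or above the threshold."""
    if not samples or samples[-1][1] < threshold_pct:
        return 0.0
    latest_ts = samples[-1][0]
    run_start = latest_ts
    # Walk backwards until a reading drops below the threshold.
    for ts, mem_pct in reversed(samples):
        if mem_pct < threshold_pct:
            break
        run_start = ts
    return latest_ts - run_start


def should_notify(
    samples: Sequence[Reading],
    threshold_pct: float = 85.0,    # assumed value, tune per node
    min_duration_s: float = 1800.0  # assumed value, tune per node
) -> bool:
    """Return True only when memory has stayed high for the whole minimum duration."""
    return breach_duration(samples, threshold_pct) >= min_duration_s
```

Run from cron, a script like this would keep a short history of readings per node and send a notification when `should_notify` returns True, so a brief spike alone does not trigger an alert.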

Any idea if this issue was resolved and whether SnapLogic applied a patch for it? We are hitting the same issue.

Most issues causing high memory usage are Snaplex capacity issues. The pipeline workload would have to be analyzed and tuned to find out why memory usage is high. Things that help to avoid memory issues include:

  1. Reduce concurrency in PipeExec snap
  2. Reduce max memory usage setting in Sort and In-Memory Lookup snaps
  3. Add additional nodes in the Snaplex (horizontal scaling). This is useful if the workload is parallelizable at the pipeline level. Setting the Snaplex property in the PipeExec snap will ensure that the pipeline workload gets distributed across nodes in the Snaplex.
  4. Increase heap memory available on the Snaplex nodes (vertical scaling). This can be used when the workload is not easily parallelizable at the pipeline level.

If the issue is that memory usage does not go down after pipeline execution completes, that would be unusual. Look at any custom snaps and scripts being used to see if they could be causing leaks. If the issue still persists, open a Support ticket for further investigation.

Hi @akidave,
We have written our own JavaScript in the pipeline flow and used Java collections in it.
Can this lead to a memory leak? I believe the SnapLogic JVM itself must perform some GC activity. Do we have to take care of the variables that we declare in our JavaScript?
We are facing the same memory leak issue: even after pipeline execution finishes, the memory is not freed.
As you said, that is unusual, so does that mean our custom JavaScript code is leaking memory? If that is the case, the cleanup of variable memory should be performed by SnapLogic itself.
Regards,
Aditya Kurhade.