Forum Discussion
Are these local executions (i.e. the Snaplex property in the PipeExec is empty)?
For local executions, PipeExec will try to send each input document to the least-loaded child execution. So, an imbalance like this can occur when the child executions process inputs slightly faster than they arrive: a child that has already finished its previous document looks idle again, so it can keep getting picked. I’d need to take a closer look at the execution stats of the parent and child executions to see if that’s really the case.
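To illustrate the effect, here is a minimal Python sketch of least-loaded routing. It is not PipeExec's actual code: the `dispatch` function, its parameters, and the rule that ties go to the first child are all assumptions made for the illustration.

```python
def dispatch(num_children, num_docs, arrival_interval, service_time):
    """Simulate least-loaded routing; return the docs handled by each child."""
    # Per child: finish times of documents that are still pending.
    pending = [[] for _ in range(num_children)]
    handled = [0] * num_children
    for i in range(num_docs):
        now = i * arrival_interval
        # Forget documents each child has already finished.
        for c in range(num_children):
            pending[c] = [t for t in pending[c] if t > now]
        # Pick the child with the fewest pending documents.
        # Assumption: ties go to the first child (min() keeps the first match).
        target = min(range(num_children), key=lambda c: len(pending[c]))
        # A child works on one document at a time, so a new document
        # starts only after everything already queued on it is done.
        start = max([now] + pending[target])
        pending[target].append(start + service_time)
        handled[target] += 1
    return handled


# Children finish each doc before the next one arrives: one child gets it all.
fast = dispatch(num_children=4, num_docs=100, arrival_interval=10, service_time=9)
# Children are slower than the arrival rate: the work spreads out evenly.
slow = dispatch(num_children=4, num_docs=100, arrival_interval=10, service_time=40)
print(fast, slow)
```

With these made-up numbers, the fast case sends every document to the first child, while the slow case splits them evenly across all four.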
- jamesv (New Contributor II), 7 years ago
Yep, Snaplex property is empty.
There’s only one node in this specific instance, too.
I think in this case I don’t need to reuse executions. It’s been a while since I reexamined what I was doing in the intermediate pipeline execute, and it doesn’t really need to be parallelized any more, so I might end up just getting rid of “reuse pipeline executions” at this point!
- tstack (Former Employee), 7 years ago
The uneven distribution looks like a bug and one has been filed.
I think the problem is that, in some cases, PipeExec is unable to determine exactly how many docs are actively being worked on, so it assumes the child pipeline is not under load when it really is. One case where this can happen is with snaps that work in batches. They consume multiple documents quickly, and the platform cannot tell whether the snap has finished working on a given document or not. So, PipeExec thinks the child is free when really the docs have only been partially processed.
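A rough Python sketch of that failure mode, under stated assumptions: `BatchingChild`, `route`, `visible_load`, and `true_load` are hypothetical names, and the model (a child whose first snap reads each doc immediately and holds it in a batch) is a simplification, not how the platform is actually implemented.

```python
class BatchingChild:
    """Hypothetical child execution whose first snap buffers docs into batches."""

    def __init__(self, batch_size):
        self.batch_size = batch_size
        self.unread = 0    # docs still waiting in the child's input
        self.buffered = 0  # docs read into a batch but not finished yet
        self.received = 0  # total docs routed to this child

    def send(self, doc):
        self.received += 1
        self.unread += 1
        self._read_one()

    def _read_one(self):
        # The batching snap consumes the doc right away, so it leaves the
        # input even though no real work has been completed on it.
        self.unread -= 1
        self.buffered += 1
        if self.buffered >= self.batch_size:
            self.buffered = 0  # the batch is full and gets processed


def visible_load(child):
    # What a dispatcher sees if it can only count unread docs.
    return child.unread


def true_load(child):
    # What the child is really holding on to.
    return child.unread + child.buffered


def route(num_children, num_docs, batch_size, load):
    """Send each doc to the child with the lowest load; ties go to the first."""
    children = [BatchingChild(batch_size) for _ in range(num_children)]
    for doc in range(num_docs):
        min(children, key=load).send(doc)
    return [c.received for c in children]


print(route(4, 40, 50, visible_load))
print(route(4, 40, 50, true_load))
```

In this toy model, routing on the visible load sends all 40 docs to the first child, because it always looks idle. Routing on the true load spreads them 10 per child.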