Order of responses when using pagination with HTTP Client

janwyl
New Contributor

Hello!

I need to fetch a large number of values from an API and retrieve them sorted. The API allows me to specify that the results are sorted, and the pagination appears to be working fine. However, I have noticed that I sometimes get errors downstream because the results are in fact not sorted correctly.

I have not fully analyzed the problem, but one theory is that the HTTP Client does not wait for a reply before sending the request for the next page. That would mean a page of results could be received out of order.

Can anyone confirm whether the paginated responses to the http client are guaranteed to be in the same order as the requests?

Thanks very much!

6 REPLIES

janwyl
New Contributor

Sorry, I should have made clear that I have to use a POST to get the sorted results from the API, with the sort parameters specified in the request body.

As a result I have been using this model https://docs-snaplogic.atlassian.net/wiki/spaces/SD/pages/3097822264/Pagination

So it's not really a question about the http client snap I guess. My question is whether this model returns results in the order of the requests.

koryknick
Employee

@janwyl - The HTTP Client and REST Post snaps are both synchronous, so they won't call for the next page until the page they have received has been passed to the next snap. My guess is that the API you're calling is using a different collation sequence than the one SnapLogic uses. There are similar known issues when reading from some databases, such as Microsoft SQL Server, which uses a slightly different collation sequence and sorts certain characters in a different order than SnapLogic expects. You are typically able to change the collation sequence at the session or query level in most databases. Unfortunately, you may not be able to do the same with the API.
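To illustrate what a collation difference looks like in practice, here is a minimal Python sketch. The keys are made up, and the two comparison rules (plain code-point order versus a case-insensitive order) are stand-ins chosen for illustration only; they are not the actual rules used by SnapLogic, SQL Server, or any particular API.

```python
# Hypothetical keys, chosen only to show two orderings disagreeing.
keys = ["b-1", "B2", "a_3", "A-4"]

# Rule 1: plain Unicode code-point comparison (Python's default for str).
# Uppercase letters sort before lowercase ones under this rule.
by_code_point = sorted(keys)

# Rule 2: a case-insensitive comparison, standing in for a different collation.
by_casefold = sorted(keys, key=str.casefold)

print(by_code_point)  # ['A-4', 'B2', 'a_3', 'b-1']
print(by_casefold)    # ['A-4', 'a_3', 'b-1', 'B2']
```

Both lists are "sorted ascending", yet a consumer that expects one rule will see the other as out of order.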

My recommendation is to throw a Sort snap in after splitting out the API results to ensure proper ordering for SnapLogic.

Hope this helps!

Thanks @koryknick 

Good to have the confirmation on the synchronous nature of HTTP Client and REST POST snaps.

However I would be surprised if sort collation was the problem. One example is where 3210448-USM18 came before 3210448-NYR6 in a supposedly ascending sort. That would suggest a sort collation that puts U before N which seems unlikely.

As I mentioned above, the model I am using is the one given here https://docs-snaplogic.atlassian.net/wiki/spaces/SD/pages/3097822264/Pagination. (Because as per that scenario I need to send sort criteria in the body of the POST for each call.)

That model does not really use the pagination functionality of the HTTP Client snap. Instead it creates a list of parameters to send, sends that list to an HTTP Client snap, and then combines the responses using a Union snap.

I note here https://community.snaplogic.com/t5/designing-and-running-pipelines/union-preserve-source-data-sequen... that the Union snap doesn't guarantee preservation of order. Is it possible that in my case the responses are coming back from the API in an order different to the sending order (perfectly possible because of varying latency), and then the responses are being presented in that order?

If that is the case, then I think the proposed model has a serious flaw. The idea as stated is to be able to specify e.g. sort criteria in the POST body, but if the use of the union means the results aren't necessarily sorted then that's no good.

In my case the whole point of specifying the sort in the API criteria is so that I can avoid using a Sort in the pipeline. I need to do a join later on a large volume and I run into memory issues if I have to sort.
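For what it's worth, if arrival order did turn out to be the cause, restoring it would not need a full sort. A rough Python sketch of the idea, outside SnapLogic: it assumes each response can be tagged with the index of the request that produced it (the `page_index` below is hypothetical, as is the function name), and it only buffers pages that arrive ahead of their turn.

```python
from typing import Iterable, Iterator


def in_request_order(arrivals: Iterable[tuple[int, list[str]]]) -> Iterator[str]:
    """Yield records page by page in request order.

    `arrivals` yields (page_index, records) pairs in whatever order the
    responses arrived. Page indexes are assumed to start at 0 with no gaps.
    """
    pending: dict[int, list[str]] = {}  # pages that arrived early
    next_page = 0                       # index of the next page to release
    for page_index, records in arrivals:
        pending[page_index] = records
        # Release every buffered page that is now contiguous with the output.
        while next_page in pending:
            yield from pending.pop(next_page)
            next_page += 1


# Page 1 arrives before page 0; each page is already sorted by the API.
arrived = [(1, ["c", "d"]), (0, ["a", "b"]), (2, ["e"])]
restored = list(in_request_order(arrived))
print(restored)
```

Memory use depends on how far ahead pages can arrive, not on the total volume, which is the point of avoiding a Sort here.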

Any thoughts? Thanks for your input, much appreciated!

koryknick
Employee

@janwyl - Actually the example you gave almost proves to me that it's the sort collation sequence - the API is sorting the hyphen character differently than SnapLogic.  One quick check you can perform is to Copy the stream and save the results after the API, then run the other side of the Copy through a Sort snap and save those results and compare for yourself.
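If it is easier to check the saved output outside the pipeline, a small Python sketch along these lines would locate the first place where the stream breaks ascending order. It assumes the sort keys have been loaded into a list and compares them by plain code-point order, which may or may not match the ordering SnapLogic applies; the function name is made up, and the first sample value is invented to pad out the two values quoted in this thread.

```python
def first_out_of_order(keys):
    """Return (position, previous, current) for the first adjacent pair that
    breaks ascending code-point order, or None if the sequence is sorted."""
    for i in range(1, len(keys)):
        if keys[i - 1] > keys[i]:
            return (i, keys[i - 1], keys[i])
    return None


# "3210448-AAA1" is a made-up value; the other two are from the thread.
sample = ["3210448-AAA1", "3210448-USM18", "3210448-NYR6"]
violation = first_out_of_order(sample)
print(violation)  # (2, '3210448-USM18', '3210448-NYR6')
```

Checking whether the violations cluster at page boundaries or appear inside a single page would help tell an arrival-order problem apart from a collation difference.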