cancel
Showing results forย 
Search instead forย 
Did you mean:ย 

Adding more lookup decreasing the performance (Redshift lookup / SQL server Lookup)

kumar
New Contributor II

Hi Team,

     Need your help /solution to solve the problem. 

When we trying to add more lookupโ€™s in the pipeline performance is drastically decreasing . To load some 30K records with 3 lookup snap, it took around 30 minutes. Instead of lookup, if i use the join instead of lookup pipeline has completed within 3 minutes.

source โ†’ Redshift Lkp1 (Good performance) โ†’ Redshift Lkp2 (Slow) โ†’ Redshift Lkp3 (Very slow) โ†’ Target

Instead of Redshift , I tried with SQL server lookup and getting the same issueโ€ฆ Also, interchanged the lookupโ€™s and tried itโ€ฆ Thatโ€™s also not workingโ€ฆ

Do you have any idea, what could be the reason for this performance issue?.

Thanks,

1 REPLY 1

stephenknilans
Contributor

Well, query lookup, especially on databases, can be very high. Is there an option to cache it? Some ETL products run a query to get all candidates, and they use a binary tree to search it. It is not only a bit faster because it gets rid of so much network traffic and latency, but it also is faster because you have ONE moderate operation instead of maybe thousands or more ones that are smaller but still very expensive.

The join, works essentially the same way the binary tree method would, but requires changing your logic a little, etcโ€ฆ

Steve