
Performance Optimization: Use Static Lookups Instead of Joins

athorat
New Contributor

Hi!

I wanted to share a powerful and lightweight approach I recently implemented for using static reference data in pipelines, without needing memory-intensive joins or separate file/database reads at runtime.

The Challenge

In typical scenarios, we handle static or reference data (like lookup tables or code descriptions) by:

  • Reading it from a file or database
  • Performing a Join Snap to enrich the main data stream

While effective, joins:

  • Can be memory-heavy, especially with large datasets
  • Add complexity to your pipeline
  • Require both sources to be aligned in structure and timing

The New Approach

Instead of performing a join, we can:

  • Store static reference data as a JSON file in SnapLogic’s SLDB
  • Load this JSON file in an Expression Library
  • Use the filter()/map() functions in your pipeline expressions to fetch data from the JSON based on a key

No joins. No file readers. Just fast in-memory lookups!

Example

Sample JSON file (staticData.json)

[
  { "code": "A1", "desc": "Alpha" },
  { "code": "B2", "desc": "Beta" },
  { "code": "C3", "desc": "Gamma" }
]

Define in Pipeline:

[Screenshot: Expression Library configured in the pipeline properties (athorat_0-1746730957897.png)]

Usage in Pipeline:

lib.static.filter(x => x.code == $code_from_source).length > 0 ? lib.static.filter(x => x.code == $code_from_source)[0].desc : "Unknown"

This setup allows you to quickly enrich your data using a simple expression, and the same logic can be reused across multiple pipelines via the library.
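For readers who want to try the lookup logic outside SnapLogic first, here is a sketch in plain JavaScript (SnapLogic's expression language is JavaScript-like; `staticData` and `lookupDesc` are illustrative names, not part of the pipeline):

```javascript
// Static reference data, as it would appear in the SLDB JSON file
const staticData = [
  { code: "A1", desc: "Alpha" },
  { code: "B2", desc: "Beta" },
  { code: "C3", desc: "Gamma" }
];

// Lookup: filter by key, fall back to "Unknown" when there is no match
function lookupDesc(codeFromSource) {
  const matches = staticData.filter(x => x.code === codeFromSource);
  return matches.length > 0 ? matches[0].desc : "Unknown";
}

console.log(lookupDesc("B2"));  // Beta
console.log(lookupDesc("Z9"));  // Unknown
```

The pipeline expression follows the same pattern, with the array coming from the imported library (`lib.static`) and the key from the incoming document (`$code_from_source`).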

Benefits
  • Faster: No join processing overhead
  • Simpler pipelines: Fewer snaps and data dependencies
  • Reusable: One JSON file + one function = many pipelines
  • Memory-efficient: Especially helpful when Snaplex memory is a constraint

Things to Consider
  • SLDB file size limit: The JSON file stored in SLDB must be under 100MB (SnapLogic’s file size limit for SLDB uploads).
  • Data updates: If your reference data changes frequently (e.g., weekly/monthly), you’ll need to build a separate job or pipeline to overwrite the SLDB file.
  • Search performance: The filter() method scans the array item by item (and the example expression runs the filter twice), which can be slow if your JSON has many records. For faster lookups, consider converting the data into a key-value map.
  • Governance: SLDB files have limited access control compared to databases. Ensure your team is aligned on ownership and update responsibility.
  • Maintainability: JSON logic is hardcoded, so changes to structure or logic require modifying the expression library and possibly redeploying affected pipelines.
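The key-value map idea from the considerations above can be sketched like this (again plain JavaScript for illustration; in practice the map could be precomputed once in the expression library itself):

```javascript
const staticData = [
  { code: "A1", desc: "Alpha" },
  { code: "B2", desc: "Beta" },
  { code: "C3", desc: "Gamma" }
];

// Build the map once: code -> desc. Each lookup is then a single
// property access instead of a scan over the whole array.
const descByCode = Object.fromEntries(
  staticData.map(x => [x.code, x.desc])
);

const lookup = code => descByCode[code] ?? "Unknown";

console.log(lookup("C3"));  // Gamma
console.log(lookup("Z9"));  // Unknown
```

For a few dozen rows the difference is negligible, but for thousands of codes looked up once per document, the map avoids re-scanning the array on every record.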

I’ve found this approach especially useful for small to medium-sized static datasets where performance, simplicity, and reusability are key. If you're looking to reduce joins and streamline your pipelines, I highly recommend giving this method a try.

To make it easier, I’ve attached a sample pipeline, JSON lookup file, and input CSV so you can see the setup in action. Feel free to explore, adapt, and let me know how it works for you!

1 Reply

ptaylor
Employee

Very good post! Thank you.