expression language and expression libraries

11 Topics

Performance Optimization: Use Static Lookups Instead of Joins
Hi! I wanted to share a powerful and lightweight approach I recently implemented for using static reference data in pipelines—without needing memory-intensive joins or separate file/database reads during runtime. The Challenge In typical scenarios, we handle static or reference data (like lookup tables or code descriptions) by: Reading it from a file or database Performing a Join Snap to enrich the main data stream While effective, joins: Can be memory-heavy, especially with large datasets Add complexity to your pipeline Require both sources to be aligned in structure and timing The New Approach Instead of performing a join, we can: Store static reference data as a JSON file in SnapLogic’s SLDB Load this JSON file in an Expression Library Use filter/map function in your pipeline expressions to fetch data from JSON based on a key No joins. No file readers. Just fast in-memory lookups! Example Sample JSON file (staticData,json) [ { "code": "A1", "desc": "Alpha" }, { "code": "B2", "desc": "Beta" }, { "code": "C3", "desc": "Gamma" } ] Define in Pipeline: Usage in Pipeline: lib.static.filter(x =>x.code == $code_from_source).length > 0 ? lib.static.filter(x =>x.code == $code_from_source)[0].desc : "Unknown" This setup allows you to quickly enrich your data using a simple expression, and the same logic can be reused across multiple pipelines via the library. Benefits Faster: No join processing overhead Simpler pipelines: Fewer snaps and data dependencies Reusable: One JSON file + one function = many pipelines Memory-efficient: Especially helpful when Snaplex memory is a constraint Things to Consider SLDB file size limit: The JSON file stored in SLDB must be under 100MB (SnapLogic’s file size limit for SLDB uploads). Data updates: If your reference data changes frequently (e.g., weekly/monthly), you’ll need to build a separate job or pipeline to overwrite the SLDB file. Search performance: The filter() method checks each item one by one, which can be slow if your JSON has a lot of records. For faster lookups, consider converting the data into a key-value map. Governance: SLDB files have limited access control compared to databases. Ensure your team is aligned on ownership and update responsibility. Maintainability: JSON logic is hardcoded, so changes to structure or logic require modifying the expression library and possibly redeploying affected pipelines. I’ve found this approach especially useful for small to medium-sized static datasets where performance, simplicity, and reusability are key. If you're looking to reduce joins and streamline your pipelines, I highly recommend giving this method a try. To make it easier, I’ve attached a sample pipeline, JSON lookup file, and input CSV so you can see the setup in action. Feel free to explore, adapt, and let me know how it works for you!
athorat
6 months ago Place Designing and Running Pipelines
506Views
2likes
1Comment