Issue with Arrays in Downstream Mappers after a Custom Snap Transformation

We built a custom snap that performs a transformation. You will notice in the output of the custom snap in the screenshot below that “medicalEnrollment” is a JSON array.

In a downstream mapper, we attempted a transformation on this array. We have tried (1) using the sl.ensureArray() method and (2) another downstream mapper performing a transformation in the array. In both attempts, however, the mappers do not detect the array, which breaks the transformation.

Below is a snippet of the expected preview versus the actual preview.

We also notice that chaining a JSON Formatter and a JSON Parser right after the custom snap does indeed make the mappers work correctly. Pattern shown below:

image

So my question is, is there something hidden in the JSON parser snap that allows strong typing of the array that we are missing? Anything in between?

@robin , Tagging you since I was told you might be able to help out. Thanks!

Sorry, I’m not really able to follow several aspects of what you’re saying here. Let’s please clarify some of these points.

We built a custom snap that performs a transformation

Is your custom snap the one labeled “Excel Structure” in the image?
It looks like it takes an XLS file as binary input, correct?
What sort of transformation does it perform?
In your first screenshot, you’re showing the output of the custom snap, where medicalEnrollment is an array. But it’s not clear what you’re saying in the next part of the post. Does the custom snap sometimes generate output where medicalEnrollment is not an array? If so, isn’t this a bug that you could fix in the custom snap’s code? That seems preferable to fixing it using a subsequent Mapper with sl.ensureArray.

In a downstream mapper, we attempted a transformation on this array.

Can you please show/say more about this transformation? Are you using the Mapping Root functionality of the Mapper? That’s the correct way to transform the elements of an array with a Mapper.

We also notice that chaining a JSON Formatter and a JSON Parser right after the custom snap does indeed make the mappers work correctly.

We would need more details about the inputs and outputs to make sense of this.

@ptaylor Thanks for the prompt response. Let me know what other information you might need.

Is your custom snap the one labeled “Excel Structure” in the image?

Correct.

It looks like it takes an XLS file as binary input, correct?

Correct.

What sort of transformation does it perform?

In essence, it’s restructuring a flat Excel structure into the more complex JSON that we need. For example, we group rows by specific criteria - some pieces of the data end up in different arrays.

Does the custom snap sometimes generate output where medicalEnrollment is not an array? If so, isn’t this a bug that you could fix in the custom snap’s code? That seems preferable to fixing it using a subsequent Mapper with sl.ensureArray.

We have tried two approaches in the custom snap: (1) keeping the output simple, so that only the necessary data is included. In this case the “medicalEnrollment” array might be missing from the output, but if it’s present, it’s always an array. (2) Producing a medicalEnrollment array for every document.

Can you please show/say more about this transformation? Are you using the Mapping Root functionality of the Mapper?

Correct. We are using the Mapping Root functionality. The three screenshots below show the input, the root, and the output of the downstream mapper respectively. I’m trying to show here that the mapper did not detect the Benefit_Election_Data Array and no transformation was performed.

Input:

Mapper:

Output:

Frankly, this is a bit hard to follow. You were talking about medicalEnrollment and then suddenly switched to Benefit_Election_Data halfway through that last reply. The content of each array looks similar but the surrounding content looks different. But let me focus on your new examples involving Benefit_Election_Data.

It’s odd that the Mapper is showing the type of Benefit_Election_Data as object rather than array, despite the fact that it’s clearly an array in the ROOT output0 screenshot. It seems that’s the core issue you’re posting about. But does this happen consistently? Your first post seemed to show that sometimes it correctly shows as an array rather than as an object, at least when you were talking about medicalEnrollment. Is the problem that the behavior is not consistent?

Can you please post a screenshot of the Mapper that includes the Input Preview panel at the bottom? I tried to reproduce the issue you’re seeing using an abbreviated version of your data in a JSON Generator, but it’s working fine for me…

Here’s the Mapper before setting the Mapping root, where the Benefit_Election_Data is correctly shown as an array:

And here’s the same Mapper after setting the Mapping root to focus on the elements of the Benefit_Election_Data array, which changes the Input Schema:

Let me ask you this: In your custom snap, what Java class are you using for the array? We typically use ArrayList.

Frankly, this is a bit hard to follow. You were talking about medicalEnrollment and then suddenly switched to Benefit_Election_Data halfway through that last reply. The content of each array looks similar but the surrounding content looks different. But let me focus on your new examples involving Benefit_Election_Data.

Apologies for the confusion. I’ll try to walk through the data flow to make it clearer:

  1. “Excel Structure” Custom Snap. Transforms an Excel file. The “medicalEnrollment” array is part of the output.
  2. “ROOT” Mapper. Transforms the base root of the data. In here, “medicalEnrollment” is transformed to $Change_Benefits_Data.Benefit_Election_Data. We expect “Benefit_Election_Data” to be the array containing the information medicalEnrollment previously had.
  3. “BENEFIT ELECTION DATA” Mapper. Transforms the “Benefit_Election_Data”.
    image

But does this happen consistently? Your first post seemed to show that sometimes it correctly shows as an array rather than as an object, at least when you were talking about medicalEnrollment. Is the problem that the behavior is not consistent?

Correct, this is the core issue of this post. In this pipeline, it happens consistently whenever a mapper is chained downstream of the custom snap. If I introduce a JSON Formatter and a JSON Parser right after the custom snap, as shown in the original post, the downstream mapper consistently behaves as we expect and transforms the array. In the second screenshot of the original post, I tried to show the expected vs. actual behavior.

Can you please post a screenshot of the Mapper that includes the Input Preview panel at the bottom?

Below is the screenshot.

Let me ask you this: In your custom snap, what Java class are you using for the array? We typically use ArrayList.

We are using a HashSet. We will try converting to ArrayList and post an update tomorrow.

I think this is the issue. If you want this object to be treated as a JSON array, its class needs to implement Java’s List interface. HashSet implements Collection, but not List, so you’ll get a weird mix of behaviors from the subsequent processing, as some of it will work fine with any Collection, but some aspects like the Mapper will only recognize a List as an array.
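To illustrate the distinction, here is a minimal standalone Java sketch. The `isJsonArray` helper is hypothetical: it stands in for any consumer that only recognizes a `List` as an array, and it is not the Mapper's actual code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;

final class ArrayTypeCheck {
    // Hypothetical check: treat a value as a JSON array only if it implements List.
    static boolean isJsonArray(Object value) {
        return value instanceof List;
    }

    public static void main(String[] args) {
        // Both values are Collections holding the same elements.
        Collection<String> asSet = new HashSet<>(Arrays.asList("a", "b"));
        Collection<String> asList = new ArrayList<>(asSet);

        System.out.println(isJsonArray(asSet));   // prints false: HashSet is not a List
        System.out.println(isJsonArray(asList));  // prints true: ArrayList is a List
    }
}
```

Code that only iterates over a `Collection` would handle both values the same way, which is consistent with the mixed behavior described above.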

@ptaylor, We were not able to make it work by switching to ArrayList. We will create another custom snap simplified to bare bones and use an ArrayList. Are you aware of any other requirements that the Mapper might need to interpret the Arrays correctly?

That’s unfortunate. I am not. This will be difficult to diagnose without a way for us to reproduce the issue. Screenshots and text descriptions won’t allow us to debug. Is there any way you can help us to reproduce this problem?

I suggest creating a support ticket and providing us with a link to a test pipeline that we can execute.

I suggest creating a support ticket and providing us with a link to a test pipeline that we can execute.

@ptaylor I already have created a ticket with support (Ticket #43346). I will follow up there with the details.

I’ve looked at the ticket and can see there’s a pipeline link. Can you please provide me with the permission to execute that pipeline? My username is ptaylor@snaplogic.com.

I see two different versions of the same custom snap pack in your /shared directory. They both appear to implement the same set of three snaps, including “Excel Structure”. One of them was updated today, but the other was last updated two days ago. I’m wondering if the pipeline you’re testing might be using the older one rather than the one you just updated to output an ArrayList. I suggest removing the older one to ensure it’s no longer being used if that’s your intention.

@ptaylor , I have granted you access to the org and created a copy for you. Our newest snap is com-snapit-excelstructure_1-0001 and that’s the one that’s being used at the moment.

Thanks, but where can I find the copy I can access?

@ptaylor Our organization is MercerDigitalDev. The pipeline is located in Workbook v3 - Projects/0000 - Snaplogic Testing. Either of the pipelines in that folder would work.

When I try to execute either of those pipelines the first snap fails. I think it needs an account.

@ptaylor Apologies for the incomplete copy. I uploaded the dummy file to the folder and linked it.

Would you be willing to share the binary snap pack for your custom snaps so I can run this on my local development plex and see what’s going on? If so: ptaylor@snaplogic.com

@ptaylor Will do!

Thanks. I installed the binary snap pack and set a breakpoint in the Mapper to see the Java types used in the input, which is the output of your snap. I can see that medicalEnrollment is still a HashSet, not an ArrayList:
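If attaching a debugger isn't convenient, a similar check can be sketched in plain Java by recording the runtime class of each field before the data is written to the output. The `TypeReport` class and its method names below are hypothetical and not part of any SnapLogic API; the sketch assumes the document data is a nested `Map`.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class TypeReport {
    // Records "path -> runtime class name" for the value and, if it is a Map, its entries.
    private static void collect(String path, Object value, Map<String, String> out) {
        out.put(path, value == null ? "null" : value.getClass().getName());
        if (value instanceof Map) {
            for (Map.Entry<?, ?> entry : ((Map<?, ?>) value).entrySet()) {
                collect(path + "." + entry.getKey(), entry.getValue(), out);
            }
        }
    }

    // Returns the report for a whole document, using "$" as the root path.
    static Map<String, String> describe(Object root) {
        Map<String, String> out = new LinkedHashMap<>();
        collect("$", root, out);
        return out;
    }
}
```

Logging `TypeReport.describe(data)` for one output document would show an entry such as `$.medicalEnrollment` mapped to `java.util.HashSet` whenever a set is still in use.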

I haven’t studied your source code but I can see that you’re still using HashSet in a lot of places. Don’t use that type anywhere in your Document data. It’s not compatible with any of the JSON types. Use ArrayList everywhere you need an array.
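As a stopgap while the remaining HashSet usages are tracked down, one option is to normalize the data just before writing it to the output. This is only a sketch under the assumption that the document data is built from nested `Map` and `Collection` values; `DocumentNormalizer` is a hypothetical helper, not something provided by the snap SDK.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class DocumentNormalizer {
    // Returns a copy of the value in which every Collection (including any Set)
    // is replaced by an ArrayList, recursing into nested Maps and Collections.
    static Object normalize(Object value) {
        if (value instanceof Map) {
            Map<String, Object> copy = new LinkedHashMap<>();
            for (Map.Entry<?, ?> entry : ((Map<?, ?>) value).entrySet()) {
                copy.put(String.valueOf(entry.getKey()), normalize(entry.getValue()));
            }
            return copy;
        }
        if (value instanceof Collection) {
            List<Object> copy = new ArrayList<>();
            for (Object item : (Collection<?>) value) {
                copy.add(normalize(item));
            }
            return copy;
        }
        // Strings, numbers, booleans, and null pass through unchanged.
        return value;
    }
}
```

Note that a HashSet has no defined iteration order, so the element order of the resulting list is arbitrary. Building the arrays as ArrayList in the first place avoids both the type problem and the ordering question.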
