
Issue with Arrays in Downstream Mappers after a Custom Snap Transformation

cclaudio
New Contributor II

We built a custom snap that performs a transformation. You will notice in the output of the custom snap in the screenshot below that “medicalEnrollment” is a JSON array.

image

In a downstream mapper, we attempted a transformation on this array. We have tried (1) using the sl.ensureArray() method and (2) another downstream mapper performing a transformation in the array. However, we notice that in both the #1 and #2 attempts the mappers do not detect the array, thus causing issues with the transformation.

Below is a snippet of the preview showing what we expect vs. what we actually get.

image

We also notice that chaining a JSON Formatter and a JSON Parser right after the custom snap does indeed make the mappers work correctly. Pattern shown below:

image

So my question is, is there something hidden in the JSON parser snap that allows strong typing of the array that we are missing? Anything in between?

@robin , Tagging you since I was told you might be able to help out. Thanks!

1 ACCEPTED SOLUTION

Thanks. I installed the binary snap pack and set a breakpoint in the Mapper to see the Java types used in the input, which is the output of your snap. I can see that medicalEnrollment is still a HashSet, not an ArrayList:

Screen Shot 2021-09-09 at 4.43.41 PM

I haven’t studied your source code but I can see that you’re still using HashSet in a lot of places. Don’t use that type anywhere in your Document data. It’s not compatible with any of the JSON types. Use ArrayList everywhere you need an array.
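
To illustrate the point, here is a minimal Java sketch that uses only java.util. The class name EnrollmentDocument, the build method, and the "plan" field are hypothetical and are not taken from the custom snap's source. It also assumes, purely for illustration, that downstream code recognizes an array by checking for java.util.List, which this thread does not confirm.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper; names and field values are illustrative only.
class EnrollmentDocument {

    // Builds one output document, storing the grouped rows as an ArrayList
    // rather than handing the Set itself to the document.
    static Map<String, Object> build(Set<Map<String, Object>> groupedRows) {
        Map<String, Object> doc = new LinkedHashMap<>();
        // Copy the set into a list so the stored value is a java.util.List.
        doc.put("medicalEnrollment", new ArrayList<>(groupedRows));
        return doc;
    }

    public static void main(String[] args) {
        Map<String, Object> row = new LinkedHashMap<>();
        row.put("plan", "example"); // placeholder data, not from the thread

        Set<Map<String, Object>> grouped = new HashSet<>();
        grouped.add(row);

        Object asSet = grouped;
        Object asList = build(grouped).get("medicalEnrollment");

        System.out.println(asSet instanceof List);  // prints false
        System.out.println(asList instanceof List); // prints true
    }
}
```

Note that a HashSet has no defined iteration order, so copying it into an ArrayList at the last moment yields an array whose element order is unspecified. If order matters, it is simpler to collect the rows in an ArrayList from the start.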


20 REPLIES

ptaylor
Employee

Sorry, I’m not really able to follow several aspects of what you’re saying here. Let’s please clarify some of these points.

We built a custom snap that performs a transformation

Is your custom snap the one labeled “Excel Structure” in the image?
It looks like it takes an XLS file as binary input, correct?
What sort of transformation does it perform?
In your first screenshot, you’re showing the output of the custom snap, where medicalEnrollment is an array. But it’s not clear what you’re saying in the next part of the post. Does the custom snap sometimes generate output where medicalEnrollment is not an array? If so, isn’t this a bug that you could fix in the custom snap’s code? That seems preferable to fixing it using a subsequent Mapper with sl.ensureArray.

In a downstream mapper, we attempted a transformation on this array.

Can you please show/say more about this transformation? Are you using the Mapping Root functionality of the Mapper? That’s the correct way to transform the elements of an array with a Mapper.

We also notice that chaining a JSON Formatter and a JSON Parser right after the custom snap does indeed make the mappers work correctly.

We would need more details about the inputs and outputs to make sense of this.

cclaudio
New Contributor II

@ptaylor Thanks for the prompt response. Let me know what other information you might need.

Is your custom snap the one labeled “Excel Structure” in the image?

Correct.

It looks like it takes an XLS file as binary input, correct?

Correct.

What sort of transformation does it perform?

In essence, it’s restructuring a flat Excel structure into the more complex JSON that we need. For example, we group rows by specific criteria, and some pieces of the data end up in different arrays.

Does the custom snap sometimes generate output where medicalEnrollment is not an array? If so, isn’t this a bug that you could fix in the custom snap’s code? That seems preferable to fixing it using a subsequent Mapper with sl.ensureArray.

We have tried two approaches in the custom snap: (1) keeping the output minimal, so that only the necessary data is emitted. In this case the “medicalEnrollment” array might be missing from the output, but whenever it is present it is always an array. (2) Producing a medicalEnrollment array for every document.

Can you please show/say more about this transformation? Are you using the Mapping Root functionality of the Mapper?

Correct. We are using the Mapping Root functionality. The three screenshots below show the input, the root, and the output of the downstream mapper respectively. I’m trying to show here that the mapper did not detect the Benefit_Election_Data Array and no transformation was performed.

Input:
image

Mapper:
image

Output:
image

ptaylor
Employee

Frankly, this is a bit hard to follow. You were talking about medicalEnrollment and then suddenly switched to Benefit_Election_Data halfway through that last reply. The content of each array looks similar but the surrounding content looks different. But let me focus on your new examples involving Benefit_Election_Data.

It’s odd that the Mapper is showing the type of Benefit_Election_Data as object rather than array, despite the fact that it’s clearly an array in the ROOT output0 screenshot. It seems that’s the core issue you’re posting about. But does this happen consistently? Your first post seemed to show that sometimes it correctly shows as an array rather than as an object, at least when you were talking about medicalEnrollment. Is the problem that the behavior is not consistent?

Can you please post a screenshot of the Mapper that includes the Input Preview panel at the bottom? I tried to reproduce the issue you’re seeing using an abbreviated version of your data in a JSON Generator, but it’s working fine for me…

Here’s the Mapper before setting the Mapping root, where the Benefit_Election_Data is correctly shown as an array:

image

And here’s the same Mapper after setting the Mapping root to focus on the elements of the Benefit_Election_Data array, which changes the Input Schema:

image

Let me ask you this: In your custom snap, what Java class are you using for the array? We typically use ArrayList.

cclaudio
New Contributor II

Frankly, this is a bit hard to follow. You were talking about medicalEnrollment and then suddenly switched to Benefit_Election_Data halfway through that last reply. The content of each array looks similar but the surrounding content looks different. But let me focus on your new examples involving Benefit_Election_Data .

Apologies for the confusion. I’ll try to lay out the data flow to make it clearer:

  1. “Excel Structure” Custom Snap transforms an Excel file. The “medicalEnrollment” array is part of the output.
  2. “ROOT” Mapper. Transforms the base root of the data. In here, “medicalEnrollment” is transformed to $Change_Benefits_Data.Benefit_Election_Data. We expect “Benefit_Election_Data” to be the array containing the information medicalEnrollment previously had.
  3. “BENEFIT ELECTION DATA” Mapper. Transforms the “Benefit_Election_Data”.
    image

But does this happen consistently? Your first post seemed to show that sometimes it correctly shows as an array rather than as an object , at least when you were talking about medicalEnrollment . Is the problem that the behavior is not consistent?

Correct, this is the core issue of this post. In this pipeline, it happens consistently whenever a mapper is chained downstream of the custom snap. If I introduce a JSON Formatter and a JSON Parser right after the custom snap, as shown in the original post, the downstream mapper consistently behaves as we expect and transforms the array. In the second screenshot of the original post, I tried to show the expected vs. actual behavior.

Can you please post a screenshot of the Mapper that includes the Input Preview panel at the bottom?

Below is the screenshot.
image

Let me ask you this: In your custom snap, what Java class are you using for the array? We typically use ArrayList.

We are using a HashSet. We will try converting to ArrayList and post an update tomorrow.
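
For a snap that already uses HashSet in many places, one possible stopgap is to normalize each document just before it is written to the output. The following is only a sketch under stated assumptions, not the change actually made in this thread: DocumentNormalizer and normalize are hypothetical names, and the sketch assumes the document is built from plain java.util maps and collections with string keys.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical utility; not part of any SnapLogic API.
class DocumentNormalizer {

    // Returns a copy of the value in which every Collection (HashSet included)
    // is replaced by an ArrayList, recursing through nested maps and collections.
    static Object normalize(Object value) {
        if (value instanceof Map) {
            Map<String, Object> copy = new LinkedHashMap<>();
            for (Map.Entry<?, ?> entry : ((Map<?, ?>) value).entrySet()) {
                copy.put(String.valueOf(entry.getKey()), normalize(entry.getValue()));
            }
            return copy;
        }
        if (value instanceof Collection) {
            List<Object> copy = new ArrayList<>();
            for (Object item : (Collection<?>) value) {
                copy.add(normalize(item));
            }
            return copy;
        }
        // Strings, numbers, booleans, and null pass through unchanged.
        return value;
    }

    public static void main(String[] args) {
        Map<String, Object> row = new LinkedHashMap<>();
        row.put("codes", new HashSet<>(Arrays.asList("A"))); // placeholder data

        Map<String, Object> doc = new LinkedHashMap<>();
        doc.put("medicalEnrollment", new HashSet<>(Arrays.asList(row)));

        Map<?, ?> normalized = (Map<?, ?>) normalize(doc);
        Object array = normalized.get("medicalEnrollment");
        System.out.println(array.getClass().getSimpleName()); // prints ArrayList
    }
}
```

Replacing HashSet with ArrayList at the point where the data is built remains the cleaner fix, since a pass like this adds a full copy of every document and cannot restore an element order that the set never kept.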