XML Generator, Entity referenced, but not declared

alex_panganiban
Contributor

I’m having trouble using the XML Generator with data that contains special symbols and characters. All the usual suspects (<, >, &, ", ') have been taken care of by using the HTML notations &lt;, &gt;, &amp;, etc., but I’m having trouble with others that I feel should work. For instance, the symbol for the registered sign, ®, should be notated as &reg;. However, the XML Generator gives me an error stating that “the entity, reg, is referenced but not declared.” How do I go about declaring these? I have quite a number of symbols and accented/umlauted letters that I need to use. Whether I use the rendered symbol or the HTML notation, the XML Generator fails.

How can I successfully include these symbols in my XML messages so that the XML Generator doesn’t see them as errors? See pics below for examples of my issue.

image
image

25 REPLIES

Wonderful, but yikes. Ok, let me parse your expression and we’ll explain together.

@mbowen Haha, I know it looks scary, but hey, when you originally came up with this solution, I thought it was brilliant and imaginative…and I still do. It works wonderfully and it solved the HTML entity problem I was having with XML. And we were able to do it without having to complicate things with doctypes and DTDs. So kudos to you. Now I just wish I comprehended how it works.

I will add, I had to place the element’s value in a CDATA wrap in order to get the XML document to import into Salesforce successfully. Before I did that, the Salesforce XML validation process was also flagging my undeclared HTML entity codes and rejecting my import.

What I ended up finally importing, after Base64 decoding my value with the replacement expression, looked something like what’s below. I did the same wrap thing with the short and long descriptions that we worked on earlier this summer as well.

<custom-attribute attribute-id="productFeatures"><![CDATA[<p>Features List &trade;</p>]]></custom-attribute>
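
For anyone following along, here is a minimal sketch of that wrap in plain JavaScript (not SnapLogic expression syntax). The helper name `wrapInCdata` is made up for illustration. The one thing to watch for is a literal `]]>` inside the value, which would close the CDATA section early, so the sketch splits it across two adjacent sections.

```javascript
// Hypothetical helper: wrap a string in a CDATA section.
// A literal "]]>" inside the text would end the section early, so it is
// split across two adjacent CDATA sections.
function wrapInCdata(text) {
  return "<![CDATA[" + text.split("]]>").join("]]]]><![CDATA[>") + "]]>";
}

// Sample value taken from the element shown above.
const features = "<p>Features List &trade;</p>";
const element =
  '<custom-attribute attribute-id="productFeatures">' +
  wrapInCdata(features) +
  "</custom-attribute>";

console.log(element);
```

Inside CDATA the parser treats `&trade;` as plain text rather than as an entity reference, which is presumably why the validation stopped complaining.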

@alex.panganiban.guild

Your final xml payloads were pretty big (lots of text), so I think we want to be as precise as possible when replacing text to prevent accidentally replacing something. So, we may want to tighten up the custom-attribute match. Here’s a slightly different version from what you have:

$input
    .replace(/<(short|long)-description>(.+?)<\/\1-description>/g,
      (_matched, prefix, b64enc) => "<" +prefix+ "-description>" +Base64.decode(b64enc)+ "</" +prefix+ "-description>")
    .replace(/<custom-attribute attribute-id="(productFeatures|attr-x)">(.+?)<\/custom-attribute>/g,
      (_matched, attrid, b64enc) => '<custom-attribute attribute-id="' +attrid+ '">' +Base64.decode(b64enc)+ '</custom-attribute>')

JavaScript regular expression quantifiers are “greedy” by default, so they will try to match as much as possible. We want to match the nearest closing element, so we use a reluctant quantifier, which is what the question mark gives us. That is,

(.+)  // greedy
(.+?) // reluctant (not greedy)
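
A quick way to see the difference, sketched in plain JavaScript with a made-up two-element string (the sample data is an assumption, not taken from the real payload):

```javascript
// Two sibling elements on one line, invented for this demonstration.
const xml =
  "<short-description>AAA</short-description>" +
  "<short-description>BBB</short-description>";

// Greedy: (.+) runs to the LAST closing tag it can find.
const greedy = xml.match(/<short-description>(.+)<\/short-description>/);

// Reluctant: (.+?) stops at the NEAREST closing tag.
const reluctant = xml.match(/<short-description>(.+?)<\/short-description>/);

console.log(greedy[1]);    // AAA</short-description><short-description>BBB
console.log(reluctant[1]); // AAA
```

With the greedy form, everything between the first opening tag and the last closing tag would be handed to the Base64 decoder as one value.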

I think we need to escape the forward slash in the regex for both replace methods (i.e., \/). I put a grouping in the attribute-id, with “attr-x” standing in for your anyNewCustomAttributeWithEncodedHTMLAddHere. You can add other attributes as needed (pipe-delimited).
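
If you want to experiment with the pattern outside SnapLogic, here is a rough equivalent in plain JavaScript. It assumes Node.js (for the global `Buffer`), and `decodeBase64` is a hypothetical stand-in for SnapLogic’s `Base64.decode`. The sample input is invented for the sketch.

```javascript
// Stand-in for SnapLogic's Base64.decode (assumes Node.js, where Buffer is a global).
function decodeBase64(b64) {
  return Buffer.from(b64, "base64").toString("utf8");
}

// Sample payload, made up for this sketch.
const encoded = Buffer.from("<p>Features List &trade;</p>", "utf8").toString("base64");
const input =
  "<short-description>" + encoded + "</short-description>" +
  '<custom-attribute attribute-id="productFeatures">' + encoded + "</custom-attribute>" +
  '<custom-attribute attribute-id="other">left-alone</custom-attribute>';

const output = input
  // \1 refers back to whichever of short|long matched in the opening tag.
  .replace(/<(short|long)-description>(.+?)<\/\1-description>/g,
    (_matched, prefix, b64enc) =>
      "<" + prefix + "-description>" + decodeBase64(b64enc) + "</" + prefix + "-description>")
  // Only the attribute ids listed in the group are decoded.
  .replace(/<custom-attribute attribute-id="(productFeatures|attr-x)">(.+?)<\/custom-attribute>/g,
    (_matched, attrid, b64enc) =>
      '<custom-attribute attribute-id="' + attrid + '">' + decodeBase64(b64enc) + "</custom-attribute>");

console.log(output);
```

The attribute with id "other" is not in the pipe-delimited group, so it passes through untouched, which is the point of tightening the match.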

This appears to work. I’ve attached my test pipeline, which contains a JSON payload in a JSON Generator, taken from an earlier payload that you had shared with me.

Thanks for sharing about having to CDATA encode data for Salesforce. Sheesh. So many transformations.

gamestop-js-replace-expr
decode-product-description-v3_2021_09_29.slp (7.1 KB)

Yes!! This is so much better. I 100% agree with you on the tightening up of the search criteria. That’s initially what I attempted to do, but kept failing miserably at. I like your solution much better. Looking at your version, I think I might have been overthinking syntax and adding in too many escape backslashes in my original attempts.

I will go through your solution and get familiar with it. I still don’t understand how the (_matched, prefix, b64enc) function is working, as I don’t see any explicit function being called, only parameters and arguments. I tried Googling this so that I could learn how you built this expression, but I had a hard time finding something that resembled this scenario. Maybe you can try to explain it to me if it won’t take too much of your time.

Thanks so much for refining my work and helping to make me a better developer.

Alex

@alex.panganiban.guild

Here is some documentation on the String replace function (our docs). If you Google “JavaScript String replace” you’ll get many hits.

https://docs-snaplogic.atlassian.net/wiki/spaces/SD/pages/1439357/String+Functions+and+Properties#St...

The replacer function that we pass to replace() will be passed these parameters:

  • The first parameter will be the matched substring (ex: _matched).

  • The parameters after that will correspond to any parenthesized groupings. We just so happen to define two groupings for each regex, so we know that we’ll be passed two parameters for each of these expressions. We could call these whatever, p1 and p2 for example. We used more meaningful names.

Per the docs, we will also be passed an offset and the original string if we add those parameters. I haven’t tested this. The original string would be quite large. I’m not a JavaScript expert, and continue to be impressed with the power of SnapLogic’s expression language.
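
To make the parameter order concrete, here is a small sketch in plain JavaScript that records what the replacer receives. It shows standard JavaScript behavior; whether SnapLogic passes the offset and original string the same way is untested, as noted above. The sample string and the `seen` array are made up for illustration.

```javascript
// Sample input, invented for this sketch.
const source = "<long-description>abc</long-description>";
const seen = [];

source.replace(/<(short|long)-description>(.+?)<\/\1-description>/g,
  // Order: full match, one parameter per parenthesized group, offset, original string.
  (matched, prefix, body, offset, original) => {
    seen.push({ matched, prefix, body, offset, sameString: original === source });
    return matched; // leave the text unchanged
  });

console.log(seen);
```

The arrow function is never called by name. `replace()` calls it once per match and uses whatever it returns as the replacement text.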