XML Generator, Entity referenced, but not declared

alex_panganiban
Contributor

I’m having trouble using the XML Generator with data that contains special symbols and characters. All the usual suspects (<, >, &, ", ') have been taken care of by using the HTML notations &lt;, &gt;, &amp;, etc., but I’m having trouble with others that I feel should work. For instance, the registered sign, ®, should be notated as &reg;. However, the XML Generator gives me an error stating that “the entity, reg, is referenced but not declared.” How do I go about declaring these? I have quite a number of symbols and accented/umlauted letters that I need to use. Whether I use the rendered symbol or the HTML notation, the XML Generator fails.

How can I successfully include these symbols in my XML messages so that the XML Generator doesn’t see them as errors? See pics below for examples of my issue.

image
image

25 REPLIES

alex_panganiban
Contributor

@mbowen Thanks for your response and it makes me feel very hopeful. I just submitted a support ticket. My ticket number is #41732. I have my fingers crossed and remain hopeful.

FYI, in my sample, the input document contains a field called “additional_romance_copy” for sku 228407. The value for this field should get placed in an output field called, “long-description,” in product-id 11110525. This is where the final results will be. In my experiments, I modified additional_romance_copy so that it contained a variety of special characters that I know my vendors use. I outlined the top 18 of them earlier in this thread. Hope this info helps.

Thanks again.

I tried setting the DOCTYPE to HTML and including the XHTML Strict DTD, but I get an error saying that a DOCTYPE is not allowed in the XML Generator snap.

image

So, I tried validating/parsing the XML with an online tool… just something to explore.

Without the DOCTYPE, the XML is invalid because the entity reg is not declared.
image

After adding the DOCTYPE, the reg and copy entities are valid because they are declared in the DTD.

image
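One alternative that needs no DOCTYPE at all: XML predefines only five named entities (lt, gt, amp, quot, apos), but numeric character references such as &#174; are valid without any declaration. The sketch below is plain JavaScript, not a SnapLogic expression, and assumes the data can be preprocessed before it reaches the XML Generator. The lookup table and the function name toNumericRefs are made up for illustration.

```javascript
// Hypothetical lookup table: extend it with the entities your vendors use.
const NAMED_TO_CODE_POINT = new Map([
  ["reg", 174],    // ®
  ["copy", 169],   // ©
  ["trade", 8482], // ™
  ["eacute", 233], // é
  ["uuml", 252],   // ü
]);

// The five entities XML predefines; these are left untouched.
const XML_PREDEFINED = new Set(["lt", "gt", "amp", "quot", "apos"]);

function toNumericRefs(text) {
  return text.replace(/&([A-Za-z][A-Za-z0-9]*);/g, (matched, name) => {
    if (XML_PREDEFINED.has(name)) return matched;
    const codePoint = NAMED_TO_CODE_POINT.get(name);
    // Unknown names are returned unchanged so nothing is silently lost.
    return codePoint === undefined ? matched : "&#" + codePoint + ";";
  });
}

console.log(toNumericRefs("Acme&reg; &amp; Sons &copy; 2020"));
```

Entities missing from the table pass through unchanged, so they would still trigger the "referenced but not declared" error and show which names need adding.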

@mbowen Hi Matthew. I hope you are well. Earlier this summer you offered me a solution to get past my XML HTML entity declaration issues. We basically Base64 encoded the value of the XML element that contained my HTML entity codes, passed it through both the XML Generator and XML Formatter snaps, and then just before we wrote the XML document to file, we used a Mapper snap with the following expression in it.

$content.replace(/<(short|long)-description>(.+?)<\/\1-description>/g,
(_matched, prefix, b64enc) => "<" + prefix + "-description>" + Base64.decode(b64enc) + "</" + prefix + "-description>")

I now have another element in the same XML document that I wish to do a Base64.decode() on; however, I’m having trouble understanding how your replace method and the callback function within it work. Could you help me understand it so that I can string an additional replace onto this same expression?
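For readers with the same question, here is a rough walkthrough of how a replace with a callback behaves in standard JavaScript, which the expression above appears to follow. It is a sketch under assumptions: Node's Buffer stands in for SnapLogic's Base64.decode, and the sample content is invented.

```javascript
// Stand-ins for SnapLogic's Base64 helpers so the sketch runs in Node.js (assumption).
const encodeBase64 = (s) => Buffer.from(s, "utf8").toString("base64");
const decodeBase64 = (s) => Buffer.from(s, "base64").toString("utf8");

// Hypothetical sample: two elements whose values were Base64 encoded upstream.
const content =
  "<short-description>" + encodeBase64("Hi &reg;") + "</short-description>" +
  "<long-description>" + encodeBase64("Bye &copy;") + "</long-description>";

const decoded = content.replace(
  // Group 1: (short|long). Group 2: (.+?), the lazily matched element value.
  // \1 repeats whatever group 1 matched, so the closing tag must pair up.
  /<(short|long)-description>(.+?)<\/\1-description>/g,
  // Called once per match with: whole match, group 1, group 2.
  // Its return value replaces the whole match, tags included,
  // which is why the tags are rebuilt around the decoded value.
  (_matched, prefix, b64enc) =>
    "<" + prefix + "-description>" + decodeBase64(b64enc) + "</" + prefix + "-description>"
);

console.log(decoded);
```

The parameter names are arbitrary and only their positions matter. The leading underscore in _matched just signals that the whole-match argument is unused.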

I’m basically trying to decode the value of the custom-attribute below, but when I try to emulate the pattern that you established, I can’t get it to work.

<custom-attribute attribute-id="productFeatures">PCFbQ0RBVEFbPHA+RmVhdHVyZXMgTGlzdDwvcD5dXT4=</custom-attribute>

Hey @alex.panganiban.guild . Hope you’ve been well too. I will look at this today.

@mbowen Good morning, Matthew. Well, I came up with a solution, yet I still don’t understand how this expression works. I just blindly and dumbly followed the pattern more closely. Mostly, the whole (_matched, prefix, b64enc) callback function escapes my comprehension. This is what I finally came up with, and it seems to work. If I have any new custom attributes in the future that contain HTML entity codes, I can just add them to the prefix options in this expression.

$content.replace(/<(short|long)-description>(.+?)<\/\1-description>/g,
(_matched, prefix, b64enc) => "<" + prefix + "-description>" + Base64.decode(b64enc) + "</" + prefix + "-description>").replace(/(productFeatures|anyNewCustomAttributeWithEncodedHTMLAddHere)">(.+?)<\/custom-attribute>/g, (_matched, prefix, b64enc) => prefix + "\">" + Base64.decode(b64enc) + "</custom-attribute>")
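To check the custom-attribute step on its own, here is a small sketch that runs the same pattern against the productFeatures sample quoted earlier in the thread. Node's Buffer again stands in for SnapLogic's Base64.decode (an assumption), and the variable names are illustrative.

```javascript
// Stand-in for SnapLogic's Base64.decode so the sketch runs in Node.js (assumption).
const fromBase64 = (s) => Buffer.from(s, "base64").toString("utf8");

// The sample element quoted earlier in the thread.
const attrContent =
  '<custom-attribute attribute-id="productFeatures">' +
  "PCFbQ0RBVEFbPHA+RmVhdHVyZXMgTGlzdDwvcD5dXT4=" +
  "</custom-attribute>";

const attrDecoded = attrContent.replace(
  // The match starts at the attribute id, so the text before it
  // (<custom-attribute attribute-id=") is left alone.
  /(productFeatures)">(.+?)<\/custom-attribute>/g,
  // The match consumed the closing tag, so the callback has to emit it again.
  (_matched, attributeId, b64enc) =>
    attributeId + '">' + fromBase64(b64enc) + "</custom-attribute>"
);

console.log(attrDecoded);
```

Here the first capture group holds the attribute id rather than a tag-name prefix. More ids can be added to the alternation, for example (productFeatures|someOtherId), where someOtherId is a placeholder.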