XML Generator, Entity referenced, but not declared

alex_panganiban
Contributor

I’m having trouble using the XML Generator with data that contains special symbols and characters. All the usual suspects (<, >, &, ", ') have been taken care of by using the HTML notations &lt;, &gt;, &amp;, etc., but I’m having trouble with others that I feel should work. For instance, the symbol for the registered sign, ®, should be notated as &reg;. However, the XML Generator gives me an error stating that “the entity, reg, is referenced but not declared.” How do I go about declaring these? I have quite a number of symbols and accented/umlauted letters that I need to use. Whether I use the rendered symbol or the HTML notation, the XML Generator fails.

How can I successfully include these symbols in my XML messages so that the XML Generator doesn’t see them as errors? See pics below for examples of my issue.

image
image

25 REPLIES

I still need to investigate this. For the Formatter, do you check the “Format as canonical XML” checkbox?

Hi, no I am not checking “Format as canonical XML.” I do check “Strict XSD Output.”

Here are my results of testing entity reference escaping.

XML Generator (Escape Special Characters: unchecked)
    Input: &#174;
    Output: Registered Sign (decimal code point: 174, utf-8 hex bytes: C2 AE)
...
XML Formatter: 
    Input: Registered Sign
    Output: Registered Sign

Running pipeline with “escape” checked.

XML Generator (Escape Special Characters: checked)
    Input: &#174;
    Output: &amp;#174; 
...
XML Formatter: 
    Input: &#174
    Output: &amp;#174

As you know, if we try to use the &reg; reference, the XML Generator will fail with “SAXParseException The entity ‘reg’ was referenced, but not declared”. However, if I check “escape special chars”, I get this.

XML Generator (Escape Special Characters: checked)
    Input: &reg;
    Output: &amp;reg; 
...
XML Formatter: 
    Input: &reg;
    Output: &amp;reg;

Admittedly, the escaping and transformations are a bit tricky in spots. I assume you would like to see &reg; or &#174; output from the XML Formatter? I can attach my sample pipeline after updating the WSDL endpoint if helpful. Where is it failing for you, and what would you like to see?
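For anyone who wants to reproduce the parsing behavior outside SnapLogic, here is a minimal sketch in plain Python using the standard library's `xml.etree.ElementTree`. It is not how the snaps are implemented (the thread does not show that); it only illustrates the general XML rules at play: numeric character references need no declaration, `&reg;` is an HTML entity that XML does not predefine, and an internal DTD subset is one way to declare it. The element name `d` is made up for the example.

```python
import xml.etree.ElementTree as ET

# Numeric character references (decimal or hex) need no declaration;
# the parser resolves both to the same character, U+00AE.
decimal = ET.fromstring("<d>&#174;</d>").text
hexadecimal = ET.fromstring("<d>&#xAE;</d>").text
utf8_hex = decimal.encode("utf-8").hex()  # the UTF-8 bytes C2 AE, as hex

# &reg; is defined by HTML, not by XML. XML predefines only
# &lt; &gt; &amp; &apos; &quot;, so an undeclared &reg; is a parse error.
try:
    ET.fromstring("<d>&reg;</d>")
    undeclared_error = None
except ET.ParseError as exc:
    undeclared_error = str(exc)

# Declaring the entity in an internal DTD subset makes &reg; legal.
declared_doc = '<!DOCTYPE d [<!ENTITY reg "&#174;">]><d>&reg;</d>'
declared = ET.fromstring(declared_doc).text

print(decimal == hexadecimal == declared)  # all three are the same character
print(utf8_hex)
print(undeclared_error)
```

Note that in every successful case the parser hands back the literal character, not the original reference, which matches the "Registered Sign" output reported above.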

Hi Matthew

Yes, your results matched what I’m experiencing too.

Ideally, I would like to see &reg;, &#174;, and &#xAE; go into the XML Generator (with Escape Special Characters unchecked) and come out in the exact same format, the same way as &lt;, &gt;, &apos;, and &quot; do. The same goes for the XML Formatter…I want to see any input that has been pre-escaped come out in the same format as it went in. This would make the behavior of the snaps consistent for all escaped codes.

As a retailer, we receive product descriptions from our vendors that are HTML formatted and friendly. I am attempting to place these product descriptions in XML messages so that they can be imported into our Salesforce hosted website and mobile app. Aside from <, >, ', ", and ®, I have around 17-20 additional escape codes that I need to use. However, the way the XML Generator and Formatter snaps work now, I am unable to do this.

Can I request that, in a future release, these snaps be changed so that the behavior is consistent for all escape codes when the Escape Special Characters option is turned off? Alternatively, how would I go about making a custom snap that behaves this way? Without a change like this, I’m pretty much dead in the water.
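One possible pre-processing step, sketched here in plain Python rather than as a SnapLogic snap or expression: rewrite HTML named entities as numeric character references before the text reaches an XML parser, leaving XML's five predefined entities alone. The helper name `named_to_numeric` and the sample string are hypothetical; the lookup table is `html.entities.name2codepoint` from the standard library. This only avoids the "referenced, but not declared" error. As reported earlier in the thread, the Generator still resolves numeric references to the literal character, so it does not by itself preserve the escaped form.

```python
import re
from html.entities import name2codepoint

# The five entities XML predefines; these must pass through unchanged.
XML_PREDEFINED = {"lt", "gt", "amp", "apos", "quot"}

# Matches a named entity reference such as &reg; or &uuml;
_NAMED_ENTITY = re.compile(r"&([A-Za-z][A-Za-z0-9]*);")


def named_to_numeric(text):
    """Replace HTML named entities with decimal character references."""

    def replace(match):
        name = match.group(1)
        # Leave XML's own entities and unknown names exactly as written.
        if name in XML_PREDEFINED or name not in name2codepoint:
            return match.group(0)
        return "&#%d;" % name2codepoint[name]

    return _NAMED_ENTITY.sub(replace, text)


sample = "Brand&reg; M&uuml;ller &amp; Co &lt;b&gt; &bogus;"
converted = named_to_numeric(sample)
print(converted)
```

Unknown names such as `&bogus;` are deliberately left untouched so that the downstream parser still reports them instead of silently losing data.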

Thanks, Alex

p.s. Answering your question, where is it failing for me?
When I use &reg; as input to either the Generator or the Formatter, they both fail due to “reg” not being declared.
When I use &#174; and &#xAE;, they go through the Generator snap successfully; however, as you’ve experienced, they come out of the Generator snap as the registered sign. Downstream, when I run this data through the XML Formatter (in preparation for a file write), the Formatter fails because it does not like the registered sign.
I cannot use the Escape Special Characters option because I have so many codes that I need to use and this option only accounts for the less than sign, the greater than sign, the apostrophe, and quotation marks. No other symbols or special characters are recognized by this option.
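As a point of comparison for the Formatter side, here is a small sketch of how a general-purpose XML serializer can emit non-ASCII characters as numeric character references: serialize with an ASCII output encoding. This uses Python's `xml.etree.ElementTree`, not the XML Formatter snap, and whether the snap exposes an equivalent output-encoding option is not established in this thread. The element name and text are invented for the example.

```python
import xml.etree.ElementTree as ET

elem = ET.Element("description")
elem.text = "Brand\u00ae M\u00fcller"  # literal registered sign and u-umlaut

# With an ASCII target encoding, characters that cannot be encoded are
# written as decimal character references instead of raw bytes.
ascii_bytes = ET.tostring(elem, encoding="us-ascii")

# With UTF-8, the same characters are written as their UTF-8 byte sequences.
utf8_bytes = ET.tostring(elem, encoding="utf-8")

# Both forms parse back to the same text.
round_trip = ET.fromstring(ascii_bytes).text

print(ascii_bytes)
print(utf8_bytes)
print(round_trip == elem.text)
```

Either output is well-formed XML that carries the same characters; the ASCII form is simply the one that keeps symbols like the registered sign in escaped notation on the wire.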

Sorry. Got caught up with another issue today. I will reply on Monday (6/7). Have a nice weekend.