Parsing XML using a Python script returns an encoding error message

jfpelletier
Contributor

Hello all,

I'm trying to parse an XML file using a Python script to extract and return the content of some textual elements, but the script is returning this error message: "Failure: 'ascii' codec can't encode character u'\xc9' in position 8242: ordinal not in range(128)...(rest of long message not pasted)"

It looks like an encoding issue: I can't read the XML content properly, even though the file is encoded in UTF-8.

I also tried the built-in XML Parser snap, but it returns the error message below. For confidentiality, I replaced the text quoted from the XML file with the string "<SOME TEXT HERE>":

Failure: Failed to convert xml to json, Reason: Mixed content is not supported: '<SOME TEXT HERE>', Resolution: Please check if the xml data is well formed

The XML file is valid: I can parse it and extract the data exactly as I need it in a standalone Python script.

Has anyone had this issue before and found a way to resolve it?
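For context on the "Mixed content" message: in XML, mixed content means an element that holds both text and child elements. Here is a minimal sketch of how a standalone Python script can read such an element with `xml.etree.ElementTree`. The sample markup and variable names are made up for illustration; they are not from my actual file.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: <p> contains both text and a child element (mixed content).
sample = "<p>Hello <b>bold</b> world</p>"

p = ET.fromstring(sample)
b = p.find("b")

# ElementTree splits mixed content into .text (before the first child)
# and .tail (text that follows a child's closing tag).
parts = [p.text, b.text, b.tail]

# itertext() walks the element and yields all text pieces in document order.
full_text = "".join(p.itertext())
```

This only shows why a well-formed file can still be rejected by an XML-to-JSON converter that does not handle mixed content; it says nothing about how the snap itself works internally.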

Many thanks in advance!

JF

1 ACCEPTED SOLUTION

jfpelletier
Contributor

Hey,

Looks like I've solved my own issue... In my script, I had to encode the input document using the ".encode('utf-8')" method.

I can now parse and correctly extract all the data that I need from the file.
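In case it helps someone else, here is a minimal sketch of the idea, not my exact script. The sample document, the element name `title`, and the variable names are assumptions for illustration. The point is to hand the parser explicit UTF-8 bytes instead of letting a text string be encoded implicitly with the ASCII codec.

```python
# -*- coding: utf-8 -*-
import xml.etree.ElementTree as ET

# Hypothetical input: a text (unicode) string containing a non-ASCII
# character, U+00C9 (capital E with acute accent), like the one in the error.
xml_text = u"<doc><title>\u00c9cole</title></doc>"

# Encoding with the ASCII codec fails on that character, which is the
# kind of failure reported in the original error message.
try:
    xml_text.encode("ascii")
    ascii_error = None
except UnicodeEncodeError as exc:
    ascii_error = str(exc)

# The fix: encode explicitly to UTF-8 bytes before parsing.
xml_bytes = xml_text.encode("utf-8")

root = ET.fromstring(xml_bytes)
title = root.findtext("title")
```

After this, `title` holds the decoded text of the element, accented character included.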

JF
