
Convert HTML to Plain Text

Drew
New Contributor

I’m using the Email Reader snap and finding that the messages it pulls only have an htmlBody and no textBody. I need the messages properly formatted in plain text. Does SnapLogic have an easy way to do this?

I have found and worked with several regexes to strip out the HTML tags, but in most cases what is left does not keep the original formatting, which can make the plain text messy.

Other options?

8 REPLIES

kristinajosifov
New Contributor

Hi Drew,

In the new release there is a simple way to encode strings into HTML entities, or decode them back, with the HTML.encode() and HTML.decode() functions.
https://docs-snaplogic.atlassian.net/wiki/spaces/SD/pages/797704355/HTML+-+Encode+and+Decode+Functio...

Regards,
Kristina.

vaidyarm
Contributor

Hi

You can try the function below 🙂

replace(/(<([^>]+)>)/ig, '')
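For reference, here is a minimal sketch of what that pattern does in plain JavaScript (not run inside SnapLogic). The sample markup is made up for illustration. It suggests why a bare tag strip can leave the text messy: the tags go away, but the contents of a style block stay behind.

```javascript
// Illustrative input only; not taken from a real message.
const html = '<style>P {margin:0;}</style><div>This<br>\r\nis<br>\r\ntext</div>';

// Same pattern as above: delete anything that looks like a tag.
const stripped = html.replace(/(<([^>]+)>)/ig, '');

// The CSS rule survives because it sits between tags, not inside one.
console.log(JSON.stringify(stripped)); // "P {margin:0;}This\r\nis\r\ntext"
```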

Thanks

dcarlson
New Contributor

That doesn’t do the trick. What the originator, Drew, and I are looking for is a way to transform this…

\r\n\r\n\r\n\r\n\r\n\r\n
\r\n

\r\n
This
\r\nis
\r\na
\r\nplain
\r\ntext
\r\nbody?
\r\n

\r\n


\r\n

\r\n
\r\n\r\n\r\n

…into this…

This\r\nis\r\na\r\nplain\r\ntext\r\nbody?\r\n

In other words, we don’t need to decode… we need to strip all of the HTML tags.

Does anyone know an easy way to do this in SnapLogic? Otherwise, the only option seems to be to build a library that knows how to do it. But someone else must have already invented that mousetrap, eh? Thanks - Davey

Ha… this website is rich text, and renders the HTML tags instead of showing them to you. 😉

Let’s try this: Convert THIS… <html>\r\n<head>\r\n<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">\r\n<style type="text/css" style="display:none;"><!-- P {margin-top:0;margin-bottom:0;} --></style>\r\n</head>\r\n<body dir="ltr">\r\n<div id="divtagdefaultwrapper" style="font-size:12pt;color:#000000;font-family:Calibri,Helvetica,sans-serif;" dir="ltr">\r\n<p></p>\r\n<div>This<br>\r\nis<br>\r\na<br>\r\nplain<br>\r\ntext<br>\r\nbody?</div>\r\n<p></p>\r\n<p><br>\r\n</p>\r\n</div>\r\n</body>\r\n</html>\r\n
… to plain text.
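One possible approach, sketched in plain JavaScript rather than as a confirmed SnapLogic feature: drop the head/style/script blocks whole, turn `<br>` and block-closing tags into line breaks, strip the remaining tags, then tidy the whitespace. The helper name `htmlToPlainText` and the list of block tags are assumptions made for this example. The same chain of replace() calls might carry over to a Mapper expression or a Script snap, but that is untested here.

```javascript
// Hypothetical helper; not a built-in SnapLogic function.
function htmlToPlainText(html) {
  const text = html
    // Drop comments and whole <head>/<style>/<script> blocks, contents included.
    .replace(/<!--[\s\S]*?-->/g, '')
    .replace(/<(head|style|script)\b[^>]*>[\s\S]*?<\/\1>/gi, '')
    // Line breaks in the HTML source are only whitespace, so flatten them.
    .replace(/\r?\n/g, ' ')
    // <br> and closing block-level tags become real line breaks.
    .replace(/<br\s*\/?>/gi, '\n')
    .replace(/<\/(p|div|li|tr|h[1-6])>/gi, '\n')
    // Strip every remaining tag.
    .replace(/<[^>]+>/g, '')
    // Decode a few common entities (&amp; last, to avoid double decoding).
    .replace(/&nbsp;/g, ' ')
    .replace(/&lt;/g, '<')
    .replace(/&gt;/g, '>')
    .replace(/&quot;/g, '"')
    .replace(/&#39;/g, "'")
    .replace(/&amp;/g, '&')
    // Tidy: trim spaces around line breaks, collapse runs of spaces and blank lines.
    .replace(/[ \t]*\n[ \t]*/g, '\n')
    .replace(/[ \t]{2,}/g, ' ')
    .replace(/\n{3,}/g, '\n\n')
    .trim();
  // Emit CRLF line endings with one trailing line break.
  return text.replace(/\n/g, '\r\n') + '\r\n';
}

// The sample message body from the post above.
const sample =
  '<html>\r\n<head>\r\n<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">\r\n' +
  '<style type="text/css" style="display:none;"><!-- P {margin-top:0;margin-bottom:0;} --></style>\r\n' +
  '</head>\r\n<body dir="ltr">\r\n' +
  '<div id="divtagdefaultwrapper" style="font-size:12pt;color:#000000;font-family:Calibri,Helvetica,sans-serif;" dir="ltr">\r\n' +
  '<p></p>\r\n<div>This<br>\r\nis<br>\r\na<br>\r\nplain<br>\r\ntext<br>\r\nbody?</div>\r\n' +
  '<p></p>\r\n<p><br>\r\n</p>\r\n</div>\r\n</body>\r\n</html>\r\n';

console.log(JSON.stringify(htmlToPlainText(sample)));
// "This\r\nis\r\na\r\nplain\r\ntext\r\nbody?\r\n"
```

Regex-based stripping like this is only a sketch for simple mail bodies. Messages with nested tables, inline images, or malformed markup would likely need a real HTML parser.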