Convert HTML to Plain Text

Question

I’m using the Email Reader Snap and finding that the messages it pulls have only an htmlBody and no textBody. I need the messages properly formatted as plain text. Does SnapLogic have an easy way to do this?
I have found and tried several regexes that strip out the HTML tags, but the formatting that is left is inconsistent in most cases and can make the plain text messy.
Are there other options?

kristinajosifov · Answer

Hi Drew,
In the new release there is a simple way to encode strings into HTML entities, and decode them back, with the HTML.encode()/decode() functions:
https://docs-snaplogic.atlassian.net/wiki/spaces/SD/pages/797704355/HTML+-+Encode+and+Decode+Functions
Regards,
Kristina.

vaidyarm · Answer

Hi
You can try the function below 🙂
replace(/(<([^>]+)>)/ig, '')
Thanks
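
To see what this kind of tag-stripping pattern does and does not do, it can be tried outside SnapLogic as plain JavaScript. The following is only a sketch: `stripTags` and the sample string are made up for illustration, the pattern is written with literal angle brackets, and it is run in Node.js rather than in a Mapper expression.

```javascript
// Hypothetical helper wrapping the suggested pattern; not a SnapLogic function.
function stripTags(html) {
  // Remove every <...> sequence, case-insensitively, across the whole string.
  return html.replace(/(<([^>]+)>)/ig, '');
}

// Made-up sample input for illustration only.
const sample = '<style>P {margin:0;}</style><div>This<br>is<br>plain</div>';

const result = stripTags(sample);
console.log(result); // prints "P {margin:0;}Thisisplain"
```

The tags are gone, but the CSS inside the `<style>` element survives as text and the `<br>` line breaks are lost, which matches the "messy" output described in the question.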

dcarlson · Answer

That doesn’t do the trick. What the original poster, Drew, and I are looking for is a way to transform this…

This
is
a
plain
text
body?

…into this…
This
is
a
plain
text
body?

In other words, we don’t need to decode… we need to strip all of the HTML tags.
Does anyone know an easy way to do this in SnapLogic? Otherwise, the only option seems to be to build a library that knows how to do it. But someone else must have already invented that mousetrap, eh? Thanks - Davey

dcarlson · Answer

Ha… this website is rich text, and renders the HTML tags instead of showing them to you. 😉
Let’s try this: Convert THIS…
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
<style type="text/css" style="display:none;"><!-- P {margin-top:0;margin-bottom:0;} --></style>
</head>
<body dir="ltr">
<div id="divtagdefaultwrapper" style="font-size:12pt;color:#000000;font-family:Calibri,Helvetica,sans-serif;" dir="ltr">
<p></p>
<div>This<br>
is<br>
a<br>
plain<br>
text<br>
body?</div>
<p></p>
<p><br>
</p>
</div>
</body>
</html>

… to plain text.
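
One possible way to get from an HTML sample like the one above to the desired plain text is sketched below in plain JavaScript. It is an illustration under stated assumptions, not a SnapLogic feature: `htmlToPlainText` is a hypothetical helper, it relies on regular expressions rather than a real HTML parser, it decodes only a handful of common entities, and it would need adapting before use in a Script Snap or an expression.

```javascript
// Hypothetical helper; a regex-based sketch, not a full HTML parser.
function htmlToPlainText(html) {
  return html
    // Drop comments, then whole <head>, <style> and <script> blocks with their contents.
    .replace(/<!--[\s\S]*?-->/g, '')
    .replace(/<(head|style|script)\b[^>]*>[\s\S]*?<\/\1>/gi, '')
    // Line breaks in the HTML source carry no meaning; treat them as spaces.
    .replace(/\s*[\r\n]+\s*/g, ' ')
    // <br> and the end of common block-level elements become real line breaks.
    .replace(/<br\s*\/?>/gi, '\n')
    .replace(/<\/(p|div|li|tr|h[1-6])>/gi, '\n')
    // Strip every remaining tag.
    .replace(/<[^>]+>/g, '')
    // Decode a few common entities (assumed subset; &amp; last to avoid double decoding).
    .replace(/&nbsp;/g, ' ')
    .replace(/&lt;/g, '<')
    .replace(/&gt;/g, '>')
    .replace(/&quot;/g, '"')
    .replace(/&amp;/g, '&')
    // Trim each line and drop the empty ones.
    .split('\n')
    .map((line) => line.trim())
    .filter((line) => line.length > 0)
    .join('\n');
}

// Shortened, made-up variant of the sample for illustration.
const sample = '<html><head><style><!-- P {margin:0;} --></style></head>' +
  '<body><div>This<br>\nis<br>\na<br>\nplain<br>\ntext<br>\nbody?</div><p><br>\n</p></body></html>';

console.log(htmlToPlainText(sample));
```

Removing the `<style>` block before stripping tags keeps the CSS out of the output, and mapping `<br>` and closing block tags to newlines preserves the line structure. Dropping empty lines is a deliberate simplification here; a version that keeps paragraph spacing would collapse runs of blank lines to one instead.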

dmiller · Answer

When writing a post in the Community, there is a Preformatted text option you can use for scripts and markup to prevent them from being rendered.
