Reading data from HTML Table

Hi guys, I am trying to read data from an HTML table. I used the REST API to connect to the web page and was able to get the response. However, parsing the table is really challenging.

I tried the XML Parser, a Mapper (html.decode), JavaScript, the JSON Parser, etc., but nothing seems to help. Below is the URL I am trying to read data from.


You were on the right track with the XML Parser. It can parse the XHTML file into a JSON structure that you can manipulate with other snaps and the expression language. Here’s an example pipeline that parses the table in the URL you sent:

ReadHTMLTable_2019_10_18.slp (13.3 KB)
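
For readers who cannot open the pipeline, here is a rough Python sketch of the same idea: treat the XHTML as XML, locate the table, and turn each row into a record keyed by the header cells. This is not the pipeline itself. The sample markup, the `table_to_records` name, and the assumption that the first row holds the headers are all made up for illustration, since the real page is not reproduced here.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; the real page's table is not shown in this thread.
SAMPLE_XHTML = """<html xmlns="http://www.w3.org/1999/xhtml">
  <body>
    <table>
      <tr><th>Name</th><th>Qty</th></tr>
      <tr><td>apple</td><td>3</td></tr>
      <tr><td>pear</td><td>5</td></tr>
    </table>
  </body>
</html>"""

# XHTML elements live in this namespace, so lookups must be qualified.
XHTML_NS = {"x": "http://www.w3.org/1999/xhtml"}


def table_to_records(xhtml_text):
    """Return one dict per data row, keyed by the text of the header cells."""
    root = ET.fromstring(xhtml_text)
    table = root.find(".//x:table", XHTML_NS)
    if table is None:
        return []
    rows = table.findall(".//x:tr", XHTML_NS)
    if not rows:
        return []
    # Assumption: the first row contains the column headers.
    headers = ["".join(cell.itertext()).strip() for cell in rows[0]]
    records = []
    for row in rows[1:]:
        cells = ["".join(cell.itertext()).strip() for cell in row]
        records.append(dict(zip(headers, cells)))
    return records


if __name__ == "__main__":
    for record in table_to_records(SAMPLE_XHTML):
        print(record)
```

This only works when the page is well-formed XHTML, which is the same condition the XML Parser approach relies on.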

Note that the XML Parser takes some time to run initially since it needs to download the XHTML DTDs and, unfortunately, the server hosting them imposes an artificial delay on that download.

The html.decode() function is for decoding HTML entities (e.g. &lt;) and not for parsing HTML itself.
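
To make the distinction concrete, here is a small Python sketch. It uses Python's standard `html.unescape` as a stand-in for entity decoding; that is an analogy chosen for illustration, not the SnapLogic function, and the sample strings are made up.

```python
import html
import xml.etree.ElementTree as ET

# Entity decoding: character references become the characters they stand for.
decoded_text = html.unescape("a &lt; b &amp; c")

# Decoding escaped markup still yields a plain string, not a structure.
decoded_markup = html.unescape("&lt;td&gt;5&lt;/td&gt;")

# Parsing is a separate step that turns markup into elements you can query.
cell = ET.fromstring(decoded_markup)

if __name__ == "__main__":
    print(decoded_text)         # the decoded characters
    print(type(decoded_markup)) # still a str
    print(cell.tag, cell.text)  # element name and its text content
```

In other words, decoding changes the characters in a string, while parsing is what gives you rows and cells to work with.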


Excellent… Such a beauty…

I had almost reached the last Mapper, but I had not even thought about the array function.

But what you created is very generic and awesome…

Thanks, Naveen