
Reading data from HTML Table

Naveen4baba
New Contributor

Hi guys, I am trying to read data from an HTML table. I used the REST API to connect to the web page and I am able to get the response. However, parsing the table is really challenging.

I tried the XML Parser, a Mapper (html.decode), JavaScript, the JSON Parser, etc., but nothing seems to help. Below is the URL I am trying to read data from.

Regards,
Naveen

7 REPLIES 7

tstack
Former Employee

You were on the right track with the XML Parser. It can parse the XHTML file into a JSON structure that you can manipulate with other snaps and the expression language. Here’s an example pipeline that parses the table in the URL you sent:

ReadHTMLTable_2019_10_18.slp (13.3 KB)
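
For readers who cannot open the .slp file, the general idea (parse the XHTML as XML, then turn the table rows into records keyed by the header cells) can be sketched outside SnapLogic. This is only a rough Python illustration and not the contents of the attached pipeline; the sample markup, the `table_to_records` name, and the assumption that the first row holds the headers are all made up for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical XHTML sample; the real page from this thread is not reproduced here.
SAMPLE_XHTML = """<html xmlns="http://www.w3.org/1999/xhtml">
  <body>
    <table>
      <tr><th>Name</th><th>Role</th></tr>
      <tr><td>Ada</td><td>Engineer</td></tr>
      <tr><td>Grace</td><td>Admiral</td></tr>
    </table>
  </body>
</html>"""

XHTML = "http://www.w3.org/1999/xhtml"
NS = {"x": XHTML}
# Fully qualified tag names for header and data cells.
CELL_TAGS = {f"{{{XHTML}}}th", f"{{{XHTML}}}td"}


def cell_texts(row):
    """Return the stripped text of every th/td cell in one table row."""
    return ["".join(cell.itertext()).strip() for cell in row if cell.tag in CELL_TAGS]


def table_to_records(xhtml_text):
    """Parse the first table in an XHTML document into a list of dicts.

    Assumes the first row contains the column headers.
    """
    root = ET.fromstring(xhtml_text)
    table = root.find(".//x:table", NS)
    if table is None:
        return []
    rows = table.findall(".//x:tr", NS)
    if not rows:
        return []
    header = cell_texts(rows[0])
    # Pair each data row with the header names, similar in spirit to
    # mapping over an array of rows in the expression language.
    return [dict(zip(header, cell_texts(row))) for row in rows[1:]]


records = table_to_records(SAMPLE_XHTML)
print(records)
```

The key step is the last one: once the rows are available as an array, each row is zipped against the header to produce one record per row.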

Note that the XML Parser takes some time to run initially since it needs to download the XHTML DTDs from w3.org and, unfortunately, they have implemented an artificial delay on that download.

The html.decode() function is for decoding HTML entities (e.g. &lt;), not for parsing HTML itself.
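
To make the distinction concrete, here is a small Python sketch. It uses Python's standard `html.unescape` as a stand-in, on the assumption that it behaves much like html.decode() for common entities; it is not the SnapLogic function itself, and the sample string is invented.

```python
import html

# A string containing HTML entities (hypothetical sample).
encoded = "&lt;td&gt;Tom &amp; Jerry&lt;/td&gt;"

# Entity decoding only turns the entities back into characters.
decoded = html.unescape(encoded)
print(decoded)  # <td>Tom & Jerry</td>

# The result is still a flat string: no rows, cells, or other structure
# have been extracted. That part needs an actual parser.
print(type(decoded).__name__)  # str
```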

Excellent… Such a beauty…

I had almost reached the last Mapper, but I hadn't even thought about the array function.

What you created is very generic and awesome…

Thanks,
Naveen

Hi @tstack, for some reason the exact same pipeline has suddenly stopped working. It throws an error on the “Parse XML” snap. I don't see anything changed in the source either. I am wondering if you can help with this.

Never mind… I just found it… Fixed it