02-13-2022 11:30 PM
Hi all,
I am trying to read an HTML file using the File Reader, where the file is saved in .txt format. I cannot use third-party libraries in the Script Snap to parse it because I am on a Cloudplex. Please let me know if anyone knows how this can be done.
Regards,
Lidiya
02-14-2022 01:44 AM
Hi @Lidiya_Thomas ,
Thanks for the sample input.
What would you like to have as output?
02-14-2022 03:04 AM
Hi @marjan.karafiloski, thanks for helping. I want all the data in the form of JSON. Please let me know if you have any ideas on how that could be done.
02-14-2022 12:28 AM
Hey @Lidiya_Thomas,
I think this is a very complex request, especially if the HTML contains various scripts and other embedded content. You can try investigating some APIs. I checked one manually on its web page and it works:
Make sure to read its security terms and policies before sending it any data.
Another way would be to use a script. Check out this link, which leads to the documentation of a Python (Jython) library for parsing HTML:
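To give a rough idea of the script approach, here is a minimal sketch that uses only the Python standard library, so no third-party package is needed. It is an illustration under assumptions, not a tested Snap script: the names `HtmlToTree` and `html_to_json` are made up for this example, the output shape (tag, attrs, children) is just one possible JSON layout, and the file contents are assumed to already be available as a string. The import fallback is there because the module is called `HTMLParser` on Python 2 / Jython 2.7 and `html.parser` on Python 3.

```python
import json

try:
    from html.parser import HTMLParser  # Python 3
except ImportError:
    from HTMLParser import HTMLParser  # Python 2 / Jython 2.7


class HtmlToTree(HTMLParser):
    """Hypothetical helper: builds a nested tree of dicts from HTML text."""

    # Elements that never have a closing tag, so they are not pushed on the stack
    VOID_TAGS = {"br", "hr", "img", "input", "meta", "link"}

    def __init__(self):
        HTMLParser.__init__(self)
        self.root = {"tag": "document", "attrs": {}, "children": []}
        self.stack = [self.root]

    def handle_starttag(self, tag, attrs):
        node = {"tag": tag, "attrs": dict(attrs), "children": []}
        self.stack[-1]["children"].append(node)
        if tag not in self.VOID_TAGS:
            self.stack.append(node)

    def handle_endtag(self, tag):
        # Pop back to the nearest matching open tag; ignore stray closing tags
        for i in range(len(self.stack) - 1, 0, -1):
            if self.stack[i]["tag"] == tag:
                del self.stack[i:]
                break

    def handle_data(self, data):
        # Keep non-empty text (this includes the bodies of script and style tags)
        text = data.strip()
        if text:
            self.stack[-1]["children"].append(text)


def html_to_json(html_text):
    """Parse an HTML string and return the tree serialized as a JSON string."""
    parser = HtmlToTree()
    parser.feed(html_text)
    parser.close()
    return json.dumps(parser.root)
```

Calling `html_to_json(contents)` on the text read from the .txt file would return one JSON string describing the whole page. How to wire this into the Script Snap's input and output views is not shown here, and for large or malformed pages the result will likely need further filtering.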
I hope you find this helpful,
Bojan