Extracting file names from HTML
08-30-2018 10:49 PM
Hi all,
I have a requirement to fetch files from a REST API service (an HTTP URL). The output of the REST Get Snap is HTML. Below is the response we got from the REST Get Snap, but we are unable to filter out only the filenames from it.
[Image: image.png, screenshot of the REST Get Snap response]
Is there any way to extract only the filenames from the above?
TIA

08-31-2018 12:26 AM
@nshruthi
Hi, it looks like your REST call returns raw data. Can you share your JSON file/entity?
08-31-2018 12:44 AM
Hi @Supratim,
Below is the response we are receiving from Rest_Get. The entity contains the HTML content from which we need to extract only the filenames.
Rest_Get.txt (964.6 KB)
TIA.
08-31-2018 04:31 AM
Hi,
Found a solution: splitting the HTML line by line in a Mapper and then using a JSON Splitter.
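For readers outside the pipeline designer, the same idea (split the HTML into lines, then pull the filename out of each line) can be sketched in Python. This is only an illustration, not the actual Mapper expression: the sample HTML, the `href="..."` pattern, and the function name `extract_filenames` are all assumptions, since the real response is only visible in the attachment.

```python
import re

# Hypothetical sample of a directory-listing style response; the real
# "entity" value from the REST Get Snap may look different.
SAMPLE_ENTITY = """<html><body>
<h1>Index of /files</h1>
<a href="../">../</a>
<a href="report_2018.csv">report_2018.csv</a>
<a href="data/archive.zip">archive.zip</a>
<p>No link on this line</p>
</body></html>"""

# Assumption: filenames appear as double-quoted href attribute values.
HREF_PATTERN = re.compile(r'href="([^"]+)"', re.IGNORECASE)


def extract_filenames(entity):
    """Split the HTML line by line and collect the filename of every link."""
    filenames = []
    for line in entity.splitlines():
        for href in HREF_PATTERN.findall(line):
            if href.endswith("/"):
                # Skip directory links such as "../".
                continue
            # Keep only the last path segment as the filename.
            filenames.append(href.rsplit("/", 1)[-1])
    return filenames


print(extract_filenames(SAMPLE_ENTITY))
```

Each element of the returned list corresponds to one output document, which is roughly the role the JSON Splitter plays after the Mapper.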
Thanks
09-04-2018 01:05 PM
Other possible solutions:
1. Maybe a JSON media type can be requested from the web service (e.g., via the Accept or Content-Type headers).
2. The HTML might be processed as XML: map the "entity" field to $['content'], then run it through a Document to Binary Snap followed by an XML Parser Snap.
I would prefer #1 over #2, as a lot (most?) of HTML is not well-formed.
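To illustrate the well-formedness concern, here is a small Python sketch (not SnapLogic code) using only the standard library. The sample HTML and the helper names `parses_as_xml`, `collect_hrefs`, and `HrefCollector` are made up for the example. It shows a strict XML parser rejecting ordinary HTML with an unclosed `<br>` tag, while a lenient HTML parser still finds the links.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# Hypothetical HTML: the <br> tag is never closed, which browsers accept
# but which is not well-formed XML.
SAMPLE_HTML = (
    '<html><body><a href="a.txt">a.txt</a><br>'
    '<a href="b.txt">b.txt</a></body></html>'
)


class HrefCollector(HTMLParser):
    """Collects the href value of every <a> start tag."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def parses_as_xml(text):
    """Return True only if a strict XML parser accepts the text."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


def collect_hrefs(text):
    """Extract link targets with the lenient HTML parser."""
    collector = HrefCollector()
    collector.feed(text)
    collector.close()
    return collector.hrefs


print(parses_as_xml(SAMPLE_HTML))
print(collect_hrefs(SAMPLE_HTML))
```

If the service's HTML happens to be well-formed, the XML route can work; otherwise a JSON response or a lenient parse is likely the safer choice.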
Cheers,
