Extracting file names from HTML

nshruthi
New Contributor II

Hi all,

I have a requirement to fetch files from a REST API service (an HTTP URL). The output of the REST Get Snap is HTML. Below is the response we got from the REST Get Snap, but we are unable to filter out only the filenames from it.

[Screenshot: image.png]

Is there any way to extract only the filenames from the above?

TIA

4 REPLIES

Supratim
Contributor III

@nshruthi
Hi, it looks like your REST call returns raw data. Can you share your JSON file / entity?

nshruthi
New Contributor II

Hi @Supratim,

Below is the response we are receiving from Rest_Get. The entity contains the HTML content from which we need to extract only the filenames.

Rest_Get.txt (964.6 KB)

TIA.

nshruthi
New Contributor II

Hi,

Found a solution: splitting the HTML line by line in a Mapper and then using a JSON Splitter.
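
For readers outside SnapLogic, the same line-by-line idea can be sketched in Python. This is only an illustration, not the pipeline itself: it assumes the HTML is a directory-style listing where each file is an `<a href="...">` link on its own line, which the thread does not confirm. The function name `filenames_by_line` and the sample markup are hypothetical.

```python
import re

# Assumption: every file is exposed as an <a href="..."> link, one per line.
HREF_RE = re.compile(r'href="([^"]+)"', re.IGNORECASE)


def filenames_by_line(entity: str) -> list[str]:
    """Split the HTML into lines and pull a filename out of each link."""
    names = []
    for line in entity.splitlines():
        match = HREF_RE.search(line)
        if match is None:
            continue  # line has no link
        href = match.group(1)
        # Skip directory links ("../", "sub/") and sort-order links ("?C=N").
        if href.endswith("/") or href.startswith("?"):
            continue
        # Keep only the last path segment as the filename.
        names.append(href.rsplit("/", 1)[-1])
    return names


# Hypothetical sample standing in for the "entity" field of the response.
sample = (
    "<html><body>\n"
    '<a href="../">Parent</a>\n'
    '<a href="report.csv">report.csv</a>\n'
    '<a href="/files/data.json">data.json</a>\n'
    "</body></html>"
)
print(filenames_by_line(sample))
```

If several links share one line, `HREF_RE.findall(line)` would be the safer choice than `search`.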

Thanks

tk42
New Contributor III

Other possible solutions:

  1. Maybe a JSON media type can be requested from the web service (e.g., via the Accept or Content-Type headers).

  2. The HTML might be processed as XML:

    • Map the "entity" field to $['content'], then run it through a Doc->Bin Snap and then an XML Parser Snap.

I would prefer #1 over #2 as a lot (most?) of HTML is not well-formed.
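
To illustrate why well-formedness matters: a strict XML parser rejects unclosed tags, while a lenient HTML parser keeps going. Below is a minimal Python sketch using the standard library's `html.parser`, assuming the filenames appear as `href` values on `<a>` tags (the actual response layout is not shown in this thread). `LinkCollector` and `extract_filenames` are hypothetical names.

```python
import posixpath
from html.parser import HTMLParser
from urllib.parse import unquote, urlsplit


class LinkCollector(HTMLParser):
    """Collects the last path segment of every <a href="..."> it sees."""

    def __init__(self):
        super().__init__()
        self.filenames = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if not href:
            return
        # Drop any query string or fragment, then decode %-escapes.
        path = unquote(urlsplit(href).path)
        name = posixpath.basename(path)
        # Directory links ("../", "sub/") and query-only links give "".
        if name and name not in (".", ".."):
            self.filenames.append(name)


def extract_filenames(html_text):
    collector = LinkCollector()
    collector.feed(html_text)
    collector.close()
    return collector.filenames


# Deliberately malformed sample: unclosed <li>, <ul>, <body> and <html>.
malformed = (
    '<html><body><ul><li><a href="a%20b.txt">a b.txt</a>'
    '<li><a href="/dir/c.png">c.png</a>'
    '<a href="../">up</a><a href="?C=N;O=D">Name</a>'
)
print(extract_filenames(malformed))
```

The parser never checks that tags are balanced, so the missing closing tags in the sample do not stop the links from being collected.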

Cheers,