Forum Discussion

nshruthi
New Contributor II
7 years ago

Extracting file names from html

Hi all,

I have a requirement to fetch files from a REST API service (an HTTP URL). The output of the REST Get Snap is HTML. Below is the response we got from the REST Get Snap, but we are unable to filter out only the file names from it.


[Screenshot: image.png]

Is there any way to extract only the file names from the above?

TIA

4 Replies

  • Supratim
    Contributor III

    @nshruthi
    Hi, it looks like your REST call returns raw data. Can you share your JSON file/entity?

  • nshruthi
    New Contributor II

    Hi,

    Found a solution: splitting the HTML line by line in a Mapper and then using a JSON Splitter.

    Thanks
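
For readers outside SnapLogic, the same line-by-line idea can be sketched in Python. This is only an illustration under stated assumptions: the actual response was posted as a screenshot, so `SAMPLE_HTML` is a hypothetical directory-style listing, and `extract_file_names` is a made-up helper name, not part of any Snap.

```python
import re

# Hypothetical sample: the real response in this thread was only shared as a
# screenshot, so this directory-style listing is an assumption.
SAMPLE_HTML = """<html><body>
<h1>Index of /files</h1>
<a href="../">../</a>
<a href="report_2017.csv">report_2017.csv</a>
<a href="data/archive.zip">archive.zip</a>
<a href="notes.txt">notes.txt</a>
</body></html>"""

# Matches the value of a double-quoted href attribute.
HREF_RE = re.compile(r'href="([^"]+)"', re.IGNORECASE)


def extract_file_names(html):
    """Split the HTML into lines and collect the file name of every link."""
    names = []
    for line in html.splitlines():
        for href in HREF_RE.findall(line):
            if href.endswith("/"):
                continue  # skip directory links such as "../"
            # Keep only the last path segment, e.g. "data/archive.zip" -> "archive.zip".
            names.append(href.rsplit("/", 1)[-1])
    return names


file_names = extract_file_names(SAMPLE_HTML)
print(file_names)
```

A regular expression is enough for a simple, predictable listing like the sample; if the real markup is messier, a proper HTML parser would be the safer choice.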

    • tk42
      New Contributor III

      Other possible solutions:

      1. Maybe a JSON media type can be requested from the web service (e.g., via the Accept or Content-Type headers).

      2. The HTML might be processed as XML:

        • Map the “entity” field to $['content'], then run it through the Doc->Bin Snap and then the XML Parser Snap.

      I would prefer #1 over #2 as a lot (most?) of HTML is not well-formed.

      Cheers,
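
Both options can be sketched outside SnapLogic in Python. This is a rough illustration, not the Snap configuration itself: the URL is a placeholder, whether the service honors the `Accept` header is an assumption that has to be checked against the real endpoint, and the two HTML strings and the `hrefs_from_xml` helper are hypothetical. `Accept` is the request header that states which response format the client would like.

```python
import urllib.request
import xml.etree.ElementTree as ET

# Option 1: ask the service for JSON instead of HTML.
# The URL is a placeholder; the request is only built here, never sent.
request = urllib.request.Request(
    "http://example.invalid/files",
    headers={"Accept": "application/json"},
)
# urllib.request.urlopen(request) would send it against a real endpoint.

# Option 2: treat the HTML as XML. Both samples below are hypothetical.
WELL_FORMED = (
    '<html><body><a href="a.csv">a.csv</a>'
    '<a href="b.txt">b.txt</a></body></html>'
)
# Unclosed <br> and <hr> are common in HTML but are not valid XML.
NOT_WELL_FORMED = '<html><body><a href="a.csv">a.csv</a><br><hr></body></html>'


def hrefs_from_xml(markup):
    """Parse the markup as XML and return the href of every <a> element."""
    root = ET.fromstring(markup)
    return [a.get("href") for a in root.iter("a") if a.get("href")]


xml_names = hrefs_from_xml(WELL_FORMED)
print(xml_names)

try:
    hrefs_from_xml(NOT_WELL_FORMED)
    parse_failed = False
except ET.ParseError:
    # This is the failure mode that makes option 2 fragile on real-world HTML.
    parse_failed = True
print(parse_failed)
```

The second sample shows why option 1 is the more robust choice when the service supports it: an XML parser rejects markup that browsers accept without complaint.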