Forum Discussion

andre_mangatal's avatar
andre_mangatal
New Contributor
7 years ago

Help Using Group By!

Dear Community

I need your help with understanding the group by in the attached simple pipeline.

I am supplying the Json data and sorting and grouping it by ID as in the picture but for each Group, i.e. ID. Then I need to first determine if a type of address exists e.g. “BUS” business or “RES” residential and then transform, e.g. via a mapper, to output both the business and residential addresses to Mapper variables $Business and $Residential output as one document/record

Thank you very much for your help in advance,
Andre M

13 Replies

  • This is the groupby. I would like to know what I need to add in the mapper or router snaps please.

  • tstack's avatar
    tstack
    Former Employee

    The GroupBy snap will collect input documents into arrays inside of a single output document. Once that is done, you’ll need to use the expression language to manipulate the arrays, so placing a Router after the GroupBy is probably not going to do what you want. If you want to use snaps to do the manipulation, you’ll need to move the Router to the head of the pipeline. I’m going to attach an example pipeline that shows both approaches: using the expression language and using the Router.

    Expressions

    There’s a few different ways to transform the addresses in the expression language in a Mapper snap. I’ll mention a couple here:

    You can use the match operator with a pattern for each type of address. For example, to extract the “BUS” address and assign it to the variable Value you would use the following and map it to the $Business property:

    match $group {
        [..., { Address: "BUS", Value }, ...] => Value
    }
    

    To handle the residential address, you’d make a new mapping and copy the above expression and replace “BUS” with “RES”. Note that if there was no address of a particular type, the match will return null.

    Another approach would be to transform the grouped address array into an array of key/value pairs and then construct a new object using the Object.extend() method. The Array.map() method can be used to transform each element in the array. In this case, we’re looking to create a pair with the desired property name mappings and the value. Here’s the full expression:

    {}.extend($group.map(elem =>
      [
          match elem.Address {
             "BUS" => "Business",
             "RES" => "Residential"
          },
          elem.Value
      ]))
    

    Note that in this approach, if an address type is missing, the property will not show up in the output document.

    Snaps (and expressions)

    If you want to use a Router to handle the different types of addresses, you’ll need to put that before the Sort and then use a Union to bring all the documents back into the same stream. Then, after the GroupBy, you’ll need to use the Object.extend() method again to merge the elements of the $group array into a single object.

    {}.extend(...$group)
    

    Note that you need to use the spread operator (…) to pass the elements of the array as the arguments to the function. Otherwise, the array will be treated as the value to be merged into the object with the element indexes used as the object properties.

    GroupByExample_2018_11_15.slp (22.3 KB)

    • andre_mangatal's avatar
      andre_mangatal
      New Contributor

      Hi tStack

      Thank you very much my friend. Really appreciate the feedback.

      Before you posted this i used both the router and mapper as follows. Do you foresee any issues with this?

      Thank you

      • andre_mangatal's avatar
        andre_mangatal
        New Contributor

        Each of the two router mappers has the config below,

  • Many thanks for the feedback on this. I used it and works good.

    -Andre

    • andre_mangatal's avatar
      andre_mangatal
      New Contributor

      Hi tstack,

      In the group if I have a date field and I wanted to get the record with the latest date returned, how would I do that?

      match $group {
      […, { EndDate: “2019-8-31”, Value }, …] => Value
      }

      Many thanks

      • tstack's avatar
        tstack
        Former Employee

        Can you clarify a couple of things:

        • Are the date fields strings or Date objects?
        • Are the elements of $group already sorted by date?
  • Hi tstack

    Thank you for the reply. I need to get a solution for this. The date fields are strings and they are sorted descending before i did the group by via JDBC execute. A sample of the json just after the group by is shown below. I am grouping by “POL_ID” then “CVG_NUM” but I need to return ONLY the first record in each group i.e. with the latest date. How can I do this please??

    [
    {“groupBy”:{“CVG_NUM”:“01”,“POL_ID”:“DT90000051”},“group”:[{“CVG_NUM”:“01”,“POL_ID”:“DT90000051”,“CO_ID”:“01”,“FIA_EFF_DT”:“2010-04-30”}]},{“groupBy”:{“CVG_NUM”:“02”,“POL_ID”:“DT90000051”},“group”:[{“CVG_NUM”:“02”,“POL_ID”:“DT90000051”,“CO_ID”:“01”,“FIA_EFF_DT”:“2010-04-30”}]},{“groupBy”:{“CVG_NUM”:“01”,“POL_ID”:“DT90000051”}]}]

    • tstack's avatar
      tstack
      Former Employee

      If you just need the first element of the group, you can index it directly, like so:

      $group[0]
      

      If you’re wanting to deconstruct it with a match expression, you can match the first element by making a pattern like the following:

      match $group {
          [{ CO_ID, FIA_EFF_DT }, …] => CO_ID + " " + FIA_EFF_DT
      }
      

      When writing a pattern for match, write something that looks like the thing you’re trying to match and use ... as a wildcard. So, the above pattern will match an array with at least one object element that has the given properties.