08-30-2024 11:58 AM
Hi,
I am building a pipeline to select latest datetime from the data file. I also need the latest value for all source_code (essentially a groupby and then max value of within group). I tried Aggregate function to select MAX over datetime and in the GROUPBY field I added source_code but do not receive the intendent result.
Is there other way ? I have sorted the columns I need to find MAX value.
09-03-2024 05:52 AM
@pranjbhatt - can you provide a small sample of input data and desired output to clarify your question?
09-03-2024 07:12 AM
Hi,
Following is the sample data. I need to have latest last_date values for each of the source (source_code) and each of the target table. Note that date is in dd-MM-yyyy format.
INPUT | |||||
TARGET TABLE | SOURCE CODE | LAST DATE | RUN DATE | LOAD STATUS | TYPE |
DIM_CLIENT | concentrix | 03-09-2024 | 03-09-2024 | Now | Positive |
DIM_CLIENT | concentrix | 30-08-2024 | 03-09-2024 | Now | Positive |
DIM_CLIENT | envizi | 01-09-2024 | 03-09-2024 | Now | Positive |
DIM_CLIENT | envizi | 27-08-2024 | 03-09-2024 | Now | Positive |
USER | envizi | 28-08-2024 | 03-09-2024 | Now | Positive |
USER | concentrix | 02-09-2024 | 03-09-2024 | Now | Positive |
USER | concentrix | 29-08-2024 | 03-09-2024 | Now | Positive |
USER | envizi | 27-08-2024 | 03-09-2024 | Now | Positive |
OUTPUT | |||||
TARGET TABLE | SOURCE CODE | LAST DATE | RUN DATE | LOAD STATUS | TYPE |
DIM_CLIENT | concentrix | 03-09-2024 | 03-09-2024 | Now | Positive |
DIM_CLIENT | envizi | 01-09-2024 | 03-09-2024 | Now | Positive |
USER | envizi | 28-08-2024 | 03-09-2024 | Now | Positive |
USER | concentrix | 02-09-2024 | 03-09-2024 | Now | Positive |