Parsing XML Data and formatting it
‎08-01-2021 11:29 PM
Hi Team,
We have a requirement to parse a large XML file (1.5 GB), group its content by the value of one field, and write a separate output file for each group.
=== Sample Input ===
<?xml version="1.0" encoding="UTF-8" ?>test
test
Test1 Test2 Test1
=== Output File 1 ===
<?xml version="1.0" encoding="UTF-8" ?>test
test
Test1 Test1
=== Output File 2 ===
<?xml version="1.0" encoding="UTF-8" ?>test
test
Test2
I have tried using the XML Parser, splitting on the child element, and adding the headers back. The problem is that, because the data is so large, CPU and memory usage climb and the run fails with a "Connection lost" error.
I have also tried XSLT but ran into the same issue.
Can you please help me design a memory-optimized solution?
Thanks in advance.
‎08-05-2021 11:48 PM
Team,
Can you please help, and let me know if any further details are required?
‎08-09-2021 08:47 PM
Can you please guide me in resolving this issue?
‎08-10-2021 05:02 AM
@acmohan023 - Thank you for providing the sample input and output. Since the post is about CPU and memory issues, could you please share a sanitized version of your pipelines as well? By sanitized, I mean with values removed from any field that reveals SFTP hostnames, IP addresses, accounts, or other sensitive details. It's just the overall logic I would like to see.
‎08-23-2021 04:10 AM
Hi @rsramkoski - Sorry for the delayed response. I have made a replica of the pipelines used. Please find the pipelines for the scenario mentioned attached.
FileReadAndProcess_2021_08_23.slp (20.0 KB)
ProcessGroup_2021_08_23.slp (11.9 KB)
WriteGroupedData_2021_08_23.slp (5.1 KB)
‎08-31-2021 09:53 PM
Team,
Can you please help with this?
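For anyone reading this thread with the same requirement, the general memory-friendly idea is to stream the file one record at a time, write each record straight to the output for its group, and discard it immediately, rather than loading or splitting the whole document in memory. Below is a minimal sketch of that idea in plain Python. It is not a SnapLogic pipeline. The element names `record` and `group` are hypothetical, because the tags in the sample above were lost, and the sketch assumes the records are direct children of the root element and that only the root wrapper needs to be repeated in each output.

```python
import io
import xml.etree.ElementTree as ET


def split_by_group(source, open_output, record_tag="record", group_tag="group"):
    """Stream `source` and append each record to a per-group output.

    `record_tag` and `group_tag` are hypothetical names; replace them with
    the real element names. `open_output(key)` must return a writable text
    object for that group. Returns a dict mapping group key to output.
    """
    outputs = {}
    context = ET.iterparse(source, events=("start", "end"))
    _, root = next(context)  # the first event is the start of the root element

    for event, elem in context:
        if event == "end" and elem.tag == record_tag:
            key = elem.findtext(group_tag, default="UNKNOWN")
            out = outputs.get(key)
            if out is None:
                # First record of this group: open its output, write the header.
                out = outputs[key] = open_output(key)
                out.write('<?xml version="1.0" encoding="UTF-8"?>\n')
                out.write(f"<{root.tag}>\n")
            out.write(ET.tostring(elem, encoding="unicode"))
            # Drop finished records so memory stays flat.
            # Assumes records are direct children of the root.
            root.clear()

    # Close the root wrapper in every output; the caller closes the files.
    for out in outputs.values():
        out.write(f"\n</{root.tag}>\n")
    return outputs


if __name__ == "__main__":
    # Tiny in-memory demo with made-up data.
    sample = io.BytesIO(
        b"<root>"
        b"<record><group>Test1</group></record>"
        b"<record><group>Test2</group></record>"
        b"<record><group>Test1</group></record>"
        b"</root>"
    )
    result = split_by_group(sample, lambda key: io.StringIO())
    for key, buf in result.items():
        print(key)
        print(buf.getvalue())
```

With real files, `open_output` could be something like `lambda key: open(f"group_{key}.xml", "w", encoding="utf-8")`, provided the group values are safe to use in file names and the number of distinct groups stays below the operating system's open-file limit. Peak memory then depends on the size of a single record rather than on the 1.5 GB file.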
