Sort all elements in JSON Object

Hi all,

I’m having an issue where I’m using the built-in Snaps to get details on a pipeline. I then push these pipeline details to GitLab to enable version control.
My issue is that, for some reason (I couldn’t find the culprit yet), the JSON gets slightly modified: either GitLab changes it, or the “read pipeline” Snap doesn’t always print the JSON content in exactly the same order.
This results in a different base64 encoding and therefore different SHAs, which is a real pain when trying to build a version control module.
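
To make the effect described above concrete, here is a minimal sketch in plain Python (not Script Snap code). The sample exports and the helper names `raw_sha` and `canonical_sha` are made up for illustration, and SHA-256 is only an assumption about which hash ends up being compared.

```python
import base64
import hashlib
import json


def raw_sha(text):
    """Hash the base64 encoding of the JSON text exactly as received."""
    encoded = base64.b64encode(text.encode("utf-8"))
    return hashlib.sha256(encoded).hexdigest()


def canonical_sha(text):
    """Parse the JSON, re-serialize it with sorted keys, then hash it."""
    canonical = json.dumps(json.loads(text), sort_keys=True, separators=(",", ":"))
    return raw_sha(canonical)


# Hypothetical pipeline exports: same content, different key order.
export_a = '{"name": "p1", "snaps": {"b": 1, "a": 2}}'
export_b = '{"snaps": {"a": 2, "b": 1}, "name": "p1"}'

print(raw_sha(export_a) == raw_sha(export_b))              # False
print(canonical_sha(export_a) == canonical_sha(export_b))  # True
```

Sorting the keys at every level before encoding is what makes the two hashes agree, which is why a first-level-only sort is not enough.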

So now I’m looking into sorting all elements in the JSON object beforehand to make sure that the content aligns. Does anyone have an idea how to do that?

I’ve looked into two options:
(i) Expression language: {}.extend($.entries().sort((left, right) => left[0].localeCompare(right[0])))
The issue here is that it only sorts the first level and ignores any sublevels.
(ii) Using a JavaScript / Python script to sort the nodes. However, I’m pretty unfamiliar with the Script Snap and find it a massive pain to use. With Python it would be pretty easy, but I’d need the json library, which I’d have to import (and I have no idea where to even begin investigating that).
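
For reference, the nested sort that option (i) does not perform could look roughly like this in plain Python. This is a sketch only: `sort_nested` is a hypothetical helper name, the sample document is invented, and it assumes the data is already made of ordinary Python dicts and lists rather than the objects the Script Snap hands over.

```python
from collections import OrderedDict


def sort_nested(value):
    """Return a copy of value with the keys of every nested dict sorted."""
    if isinstance(value, dict):
        return OrderedDict((key, sort_nested(value[key])) for key in sorted(value))
    if isinstance(value, list):
        # Keep list order, but sort any dicts inside the list.
        return [sort_nested(item) for item in value]
    return value


# Hypothetical sample document.
doc = {"b": {"z": 1, "a": 2}, "a": [{"y": 1, "x": 2}]}
result = sort_nested(doc)
print(list(result.keys()))          # ['a', 'b']
print(list(result["b"].keys()))     # ['a', 'z']
print(list(result["a"][0].keys()))  # ['x', 'y']
```

Lists are deliberately left in their original order, since element order in a JSON array is part of the data.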

Edit: Looks like I don’t need to add third-party libraries; however, I still can’t get Python to work:

I can parse inDoc as a dictionary, but json.dumps() just won’t work.


# Import the interface required by the Script snap.
from com.snaplogic.scripting.language import ScriptHook
import json

class TransformScript(ScriptHook):
    def __init__(self, input, output, error, log):
        self.input = input
        self.output = output
        self.error = error
        self.log = log

    # The "execute()" method is called once when the pipeline is started
    # and allowed to process its inputs or just send data to its outputs.
    def execute(self):
        self.log.info("Executing Transform script")
        while self.input.hasNext():
            try:
                # Read the next input document, store it in a new dictionary, and write this as an output document.
                inDoc = self.input.next()

                dictionary = dict(inDoc)
                test = json.dumps(dictionary, sort_keys=True)
                

                outDoc = {
                    'original' : dictionary
                }
                self.output.write(inDoc, outDoc)
            except Exception as e:
                errDoc = {
                    'error' : str(e)
                }
                self.log.error("Error in python script")
                self.error.write(errDoc)

        self.log.info("Script executed")

    # The "cleanup()" method is called after the snap has exited the execute() method
    def cleanup(self):
        self.log.info("Cleaning up")

# The Script Snap will look for a ScriptHook object in the "hook"
# variable.  The snap will then call the hook's "execute" method.
hook = TransformScript(input, output, error, log)

Any tips are appreciated, thanks!

Best regards
Thomas
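
One possible explanation for the json.dumps() failure, offered as an assumption rather than a confirmed diagnosis: dict(inDoc) only converts the top level, so nested values may still be Java map and list objects that the json module does not know how to serialize. A hedged sketch of a recursive conversion to plain Python types follows. `to_plain` and `FakeJavaMap` are hypothetical names, and `FakeJavaMap` merely stands in for such a nested object so the example runs outside the Script Snap.

```python
import json

try:
    string_types = (basestring,)  # Python 2 / Jython
except NameError:
    string_types = (str, bytes)   # Python 3


def to_plain(value):
    """Recursively copy map-like and list-like values into dicts and lists."""
    if value is None or isinstance(value, string_types + (bool, int, float)):
        return value
    if hasattr(value, "keys"):
        return dict((str(key), to_plain(value[key])) for key in value.keys())
    if hasattr(value, "__iter__"):
        return [to_plain(item) for item in value]
    return value


class FakeJavaMap(object):
    """Stand-in for a nested map that json.dumps() does not recognize."""

    def __init__(self, data):
        self._data = data

    def keys(self):
        return self._data.keys()

    def __getitem__(self, key):
        return self._data[key]


in_doc = {"b": FakeJavaMap({"z": 1, "a": None}), "a": [FakeJavaMap({"y": 1})]}
print(json.dumps(to_plain(in_doc), sort_keys=True))
# {"a": [{"y": 1}], "b": {"a": null, "z": 1}}
```

Whether the objects in a real Script Snap document respond to this duck typing has not been verified here; other Java value types (dates, big decimals) would need their own handling.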

I’m trying to solve the same problem, but for GitHub instead of GitLab. Did you ever resolve this?

I started with Python too and ran into a similar issue. Since this is Python 2 on Jython, it felt more prudent to stick with JavaScript if I could. I’m very inexperienced in JavaScript, but I was able to get a pretty-printed version of the JSON object, though only with the top-level keys sorted. I don’t know how to traverse the LinkedHashMap object that SnapLogic uses and convert it into a sorted TreeMap object.

// Ensure compatibility with both JDK 7 and 8 JSR-223 Script Engines
try { load("nashorn:mozilla_compat.js"); } catch(e) { }

// Import the interface required by the Script snap.
importPackage(com.snaplogic.scripting.language);

// Import the serializable Java type we'll use for the output data.
importClass(java.util.LinkedHashMap);
importClass(java.util.TreeMap);

importClass(com.google.gson.Gson);
importClass(com.google.gson.GsonBuilder);

/**
 * Create an object that implements the methods defined by the "ScriptHook"
 * interface.  We'll be passing this object to the constructor for the
 * ScriptHook interface.
 */
var impl = {
    /*
     * These variables (input, output, error, log) are defined by the
     * ExecuteScript snap when evaluating this script.
     */
    input : input,
    output : output,
    error : error,
    log : log,

    /**
     * The "execute()" method is called once when the pipeline is started
     * and allowed to process its inputs or just send data to its outputs.
     *
     * Exceptions are automatically caught and sent to the error view.
     */
    execute : function () {
       this.log.info("Executing Transform Script");
        while (this.input.hasNext()) {
            try {
                // Read the next input document, copy it into a TreeMap (which sorts the top-level keys only),
                // and write the result as an output document.
                // We must use a serializable Java type like LinkedHashMap for each output instead of a native
                // JavaScript object so that downstream Snaps like Copy can process it correctly.
                var inDoc = this.input.next();
                var outDoc = new LinkedHashMap();

                var inDocTree = new TreeMap();
                inDocTree.putAll(inDoc);

                var gson = new GsonBuilder().serializeNulls().setPrettyPrinting().create();
                var json_pretty = gson.toJson(inDocTree, TreeMap.class);

                outDoc.put("original", inDoc);
                outDoc.put('inDocTree', inDocTree);
                outDoc.put("json_pretty", json_pretty);
                this.output.write(inDoc, outDoc);
            }
            catch (err) {
                var errDoc = new LinkedHashMap();
                errDoc.put("error", err);
                this.log.error(err);
                this.error.write(errDoc);
            }
        }
        this.log.info("Script executed");
    },

    /**
     * The "cleanup()" method is called after the snap has exited the execute() method
     */
    cleanup : function () {
       this.log.info("Cleaning up")
    }
};

/**
 * The Script Snap will look for a ScriptHook object in the "hook"
 * variable.  The snap will then call the hook's "execute" method.
 */
var hook = new com.snaplogic.scripting.language.ScriptHook(impl);

Alright, I have something. I stuck with JavaScript and built two recursive functions to convert the LinkedHashMap into a TreeMap (which sorts the object by key) before converting to JSON text.

// Ensure compatibility with both JDK 7 and 8 JSR-223 Script Engines
try { load("nashorn:mozilla_compat.js"); } catch(e) { }

// Import the interface required by the Script snap.
importPackage(com.snaplogic.scripting.language);

// Import the serializable Java type we'll use for the output data.
importClass(java.util.LinkedHashMap);
importClass(java.util.TreeMap);
importClass(java.util.ArrayList);

importClass(com.google.gson.Gson);
importClass(com.google.gson.GsonBuilder);

// Copy the elements of a java.util.ArrayList into `array`, converting nested
// maps to TreeMaps and nested lists to new ArrayLists. List order is preserved.
function putObjectInArray(object, array) {
    // Declare the loop variable with `var` so each recursive call gets its own copy.
    for (var index in object) {
        if (Object.prototype.toString.call(object[index]) === '[object java.util.ArrayList]') {
            var nested_array = new ArrayList();
            putObjectInArray(object[index], nested_array);
            array.add(nested_array);
        } else if (Object.prototype.toString.call(object[index]) === '[object java.util.LinkedHashMap]') {
            var nested_tree = new TreeMap();
            putObjectInTree(object[index], nested_tree);
            array.add(nested_tree);
        } else {
            array.add(object[index]);
        }
    }
}

// Copy the entries of a java.util.LinkedHashMap into `tree` (a TreeMap, which
// keeps its keys sorted), converting nested maps and lists recursively.
function putObjectInTree(object, tree) {
    for (var property in object) {
        if (object[property] === null) {
            tree.put(property, null);
        } else if (Object.prototype.toString.call(object[property]) === '[object java.util.ArrayList]') {
            var tree_array = new ArrayList();
            putObjectInArray(object[property], tree_array);
            tree.put(property, tree_array);
        } else if (Object.prototype.toString.call(object[property]) === '[object java.util.LinkedHashMap]') {
            var nested_tree = new TreeMap();
            putObjectInTree(object[property], nested_tree);
            tree.put(property, nested_tree);
        } else {
            tree.put(property, object[property]);
        }
    }
}

/**
 * Create an object that implements the methods defined by the "ScriptHook"
 * interface.  We'll be passing this object to the constructor for the
 * ScriptHook interface.
 */
var impl = {
    /*
     * These variables (input, output, error, log) are defined by the
     * ExecuteScript snap when evaluating this script.
     */
    input : input,
    output : output,
    error : error,
    log : log,

    /**
     * The "execute()" method is called once when the pipeline is started
     * and allowed to process its inputs or just send data to its outputs.
     *
     * Exceptions are automatically caught and sent to the error view.
     */
    execute : function () {
       this.log.info("Executing Transform Script");
        while (this.input.hasNext()) {
            try {
                // Read the next input document, convert it into a TreeMap with recursively sorted keys,
                // and write the result as an output document.
                // We must use a serializable Java type like LinkedHashMap for each output instead of a native
                // JavaScript object so that downstream Snaps like Copy can process it correctly.
                var inDoc = this.input.next();
                var outDoc = new LinkedHashMap();

                var gson = new GsonBuilder().serializeNulls().setPrettyPrinting().create();

                var myTree = new TreeMap();
                putObjectInTree(inDoc, myTree);
                var json = gson.toJson(myTree, TreeMap.class);

                outDoc.put("original", inDoc);
                outDoc.put('myTree', myTree);
                outDoc.put("json_pretty", json);
                this.output.write(inDoc, outDoc);
            }
            catch (err) {
                var errDoc = new LinkedHashMap();
                errDoc.put("error", err);
                this.log.error(err);
                this.error.write(errDoc);
            }
        }
        this.log.info("Script executed");
    },

    /**
     * The "cleanup()" method is called after the snap has exited the execute() method
     */
    cleanup : function () {
       this.log.info("Cleaning up")
    }
};



/**
 * The Script Snap will look for a ScriptHook object in the "hook"
 * variable.  The snap will then call the hook's "execute" method.
 */
var hook = new com.snaplogic.scripting.language.ScriptHook(impl);

Sorry for the late reply, I was off for some time.

I ended up just dumping it to a sorted string and then parsing the string. I’ve attached the pipeline; if you run into issues with your solution, maybe you can fall back to this.

sortJSON_2021_07_19.slp (6.4 KB)
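
The attached pipeline is not reproduced here, so the following is only a sketch of the general "dump sorted, then parse" idea in plain Python, under the assumption that the document consists of ordinary dicts and lists. `sort_by_round_trip` is a hypothetical helper name and the sample document is invented.

```python
import json
from collections import OrderedDict


def sort_by_round_trip(document):
    """Serialize with sorted keys, then parse back while keeping that order."""
    sorted_text = json.dumps(document, sort_keys=True)
    return json.loads(sorted_text, object_pairs_hook=OrderedDict)


doc = {"b": {"z": 1, "a": 2}, "a": 0}
result = sort_by_round_trip(doc)
print(json.dumps(result))  # {"a": 0, "b": {"a": 2, "z": 1}}
```

The `object_pairs_hook=OrderedDict` argument matters on Python 2, where a plain dict would not keep the sorted order after parsing.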

Best regards
Thomas