Configuring the Script Snap to use a configured HTTP proxy environment variable

Question

HTTP-compatible Snap Packs can leverage an HTTP proxy configured in the Snaplex’s Network Proxies configuration tab within the SnapLogic Manager web application.
However, the Script Snap is different because you can write Scripts to call external processes (e.g. curl) and these will not be aware of any proxy configuration set within the SnapLogic application.
curl can be configured to use a proxy directly via the --proxy argument, but if you want to enforce that proxy for every use of the Script Snap, you can set the http_proxy and/or https_proxy environment variables in a special file, /etc/sysconfig/jcc.
Environment variables declared within this file will be visible to the Snaplex application (OS-level env vars will not be).
This file (and directory) may not exist in your Snaplex, so you may have to create them (similar to the instructions on the Configuring a custom JRE version page):

sudo mkdir -p /etc/sysconfig; sudo sh -c "echo 'export http_proxy=username:password@proxy-ip-address:port' >> /etc/sysconfig/jcc"

substituting your own values for username/password (if authentication is required), proxy-ip-address, and port. You may also want to add an equivalent https_proxy line.
Once this file is created, restart the Snaplex application (/opt/snaplogic/bin/jcc.sh restart or c:\opt\snaplogic\bin\jcc.bat restart) and the http_proxy/https_proxy environment variable will now be active within the SnapLogic product.
Assuming your proxy is correctly configured, you can then run your Script that calls the external process and, if the process supports using a proxy, it will respect the setting.
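The mechanism at work here is ordinary environment inheritance: a child process receives a copy of its parent's environment. The sketch below illustrates this outside SnapLogic with plain Python 3 rather than the Script Snap's own interpreter. The function name `child_sees_proxy` and the proxy address are hypothetical, and the child is a second Python interpreter standing in for curl.

```python
import os
import subprocess
import sys


def child_sees_proxy(proxy_value):
    """Start a child process and return the http_proxy value it observes."""
    # Copy the current environment and add the proxy variable. This stands in
    # for (and is only assumed to resemble) the environment the Snaplex has
    # after /etc/sysconfig/jcc is read on restart.
    env = dict(os.environ)
    env["http_proxy"] = proxy_value

    # The child simply prints the variable it inherited from the parent.
    child_code = "import os; print(os.environ.get('http_proxy', ''))"
    out = subprocess.check_output([sys.executable, "-c", child_code], env=env)
    return out.decode().strip()


if __name__ == "__main__":
    # Hypothetical proxy address, for illustration only.
    print(child_sees_proxy("proxy.example.internal:3128"))
```

If the child reports an empty value in a similar check run from your own environment, the variable is not reaching the parent process in the first place.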
The following Script Snap (Python) uses the subprocess library to execute curl and adds the response body to the output document:
# Import the interface required by the Script snap.
from com.snaplogic.scripting.language import ScriptHook
import subprocess

class TransformScript(ScriptHook):
    def __init__(self, input, output, error, log):
        self.input = input
        self.output = output
        self.error = error
        self.log = log

    # The "execute()" method is called once when the pipeline is started
    # and allowed to process its inputs or just send data to its outputs.
    def execute(self):
        self.log.info("Executing Transform script")
        while self.input.hasNext():
            try:
                # Read the next input document, run curl as an external process,
                # and write curl's response body as an output document.
                inDoc = self.input.next()
                proc = subprocess.Popen(['curl','https://www.snaplogic.com'], stdout=subprocess.PIPE)
                (out, err) = proc.communicate()
                outDoc = {
                    'original' : out
                }
                self.output.write(inDoc, outDoc)
            except Exception as e:
                errDoc = {
                    'error' : str(e)
                }
                self.log.error("Error in python script")
                self.error.write(errDoc)

        self.log.info("Script executed")

    # The "cleanup()" method is called after the snap has exited the execute() method
    def cleanup(self):
        self.log.info("Cleaning up")

# The Script Snap will look for a ScriptHook object in the "hook"
# variable.  The snap will then call the hook's "execute" method.
hook = TransformScript(input, output, error, log)

On execution, the proxy access log should show the request being routed through the proxy.
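If a single script needs a different proxy from the Snaplex-wide setting, the --proxy argument mentioned earlier can still be passed per call; curl gives the command-line option precedence over the http_proxy/https_proxy variables. A minimal sketch, assuming plain Python; the helper names `build_curl_command` and `fetch` and the proxy address in the usage note are hypothetical:

```python
import subprocess


def build_curl_command(url, proxy=None):
    """Return the curl argument list, optionally forcing a specific proxy."""
    # --silent suppresses curl's progress meter on stderr.
    cmd = ["curl", "--silent"]
    if proxy:
        # An explicit --proxy overrides any proxy taken from the environment.
        cmd += ["--proxy", proxy]
    cmd.append(url)
    return cmd


def fetch(url, proxy=None):
    """Run curl and return (exit code, stdout, stderr)."""
    proc = subprocess.Popen(
        build_curl_command(url, proxy),
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
    )
    out, err = proc.communicate()
    return proc.returncode, out, err
```

Inside the Script Snap, a call such as fetch('https://www.snaplogic.com', 'proxy.example.internal:3128') could replace the Popen call in the example above; a non-zero exit code or non-empty stderr can then be written to the error view.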

smudassir · Answer

Hi @rashmi ,
The Execute snap may be able to run all the create statements followed by the select statement. However, it returns output based on the first query only. In your case, that is a create statement, so the output is just a success message.
If you have all the create statements first followed by a select statement at the end, then I suggest you try one of these two approaches:
- Approach 1:
a) Use two snaps. The first snap has to be the Execute snap. In this first snap, put only the create statements.
b) The second snap can be an Execute snap with the select statement, or the Select snap itself.
- Approach 2:
a) Use a Multi Execute snap followed by an Execute/Select snap.
b) Put all the create statements in the Multi Execute snap.
c) Then use an Execute snap with the select statement, or the Select snap itself.
Let me know if this helped.

koryknick · Answer

Here is an example pipeline that performs as I described above.  It can create the temporary tables, then query from them. Note that you may want to drop the tables as I have done in my example in case they are still present in your session due to connection pooling.
Redshift script execute_2023_03_20.slp (8.8 KB)
I hope this helps!

koryknick · Answer

I have not tested with Redshift, but I know it works for other databases.  You could try to use a Redshift account with Auto Commit disabled and pass the statements into a Redshift Execute snap one at a time.  You will need a Router or Filter after the Execute to ignore the “success” records from the Create Table commands vs. the Select output.
If that doesn’t work, can you use CTE (with clause) to sub-query your data rather than build temporary tables?
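To illustrate the CTE idea, here is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in for Redshift, purely so it can be run anywhere. The table, columns, and data are made up for illustration, and Redshift syntax may differ in details. The point is that the WITH clause does the job of the temporary table inside a single select statement, so there is only one statement to execute and one result set to read.

```python
import sqlite3

# In-memory SQLite database standing in for the real warehouse.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("a", 10), ("a", 5), ("b", 7)],
)

# Instead of CREATE TEMP TABLE totals AS ... followed by a separate SELECT,
# the intermediate result is defined inline with a WITH clause.
CTE_QUERY = """
WITH totals AS (
    SELECT customer, SUM(amount) AS total
    FROM orders
    GROUP BY customer
)
SELECT customer, total
FROM totals
WHERE total > 8
ORDER BY customer
"""

rows = conn.execute(CTE_QUERY).fetchall()
print(rows)
```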

rashmi · Answer

Thanks a lot for taking the time to help, @smudassir and @koryknick. I will try both and see.
