06-23-2017 04:09 PM
Can someone please shed some light on using a third-party library (boto3, requests) in a Python script?
Thank you in advance!
06-27-2017 07:28 AM
AFAIK, you can call third-party Java libraries from inside the Script snap when using Python as the scripting language. Here is a sample script that uses the AWS Java SDK to list the S3 objects in a bucket:
# Script begin
# Import the interface required by the Script snap.
import java.util
import json
import sys
sys.path.append('/opt/snaplogic/userlibs/aws-java-sdk-1.9.6.jar')
from com.amazonaws import AmazonClientException
from com.amazonaws import AmazonServiceException
from com.amazonaws.regions import Region
from com.amazonaws.regions import Regions
from com.amazonaws.services.s3 import AmazonS3
from com.amazonaws.services.s3 import AmazonS3Client
from com.amazonaws.services.s3.model import Bucket
from com.amazonaws.services.s3.model import GetObjectRequest
from com.amazonaws.services.s3.model import ListObjectsRequest
from com.amazonaws.services.s3.model import ObjectListing
from com.amazonaws.services.s3.model import PutObjectRequest
from com.amazonaws.services.s3.model import S3Object
from com.amazonaws.services.s3.model import S3ObjectSummary
from com.snaplogic.scripting.language import ScriptHook

class TransformScript(ScriptHook):
    def __init__(self, input, output, error, log):
        self.input = input
        self.output = output
        self.error = error
        self.log = log

    # The "execute()" method is called once when the pipeline is started
    # and allowed to process its inputs or just send data to its outputs.
    def execute(self):
        self.log.info("Executing Transform script")
        while self.input.hasNext():
            try:
                # Read the next document, wrap it in a map and write out the wrapper
                in_doc = self.input.next()
                # Map of S3 object key to object size, written out as the result document
                wrapper = java.util.HashMap()
                # bucket is a property set in a mapper that precedes script snap and holds the bucket
                # name
                bucket = in_doc.get("bucket")
                # $_bucketPipelineParam is a pipeline param and holds the bucket name
                bucketParam = $_bucketPipelineParam
                s3 = AmazonS3Client()
                usEast1 = Region.getRegion(Regions.US_EAST_1)
                s3.setRegion(usEast1)
                objectListing = s3.listObjects(ListObjectsRequest().withBucketName(bucket))
                for objectSummary in objectListing.getObjectSummaries():
                    wrapper.put(objectSummary.getKey(), objectSummary.getSize())
                self.output.write(in_doc, wrapper)
            except Exception as e:
                errWrapper = {
                    'errMsg' : str(e.args)
                }
                self.log.error("Error in python script")
                self.error.write(errWrapper)
        self.log.info("Finished executing the Transform script")

# The Script Snap will look for a ScriptHook object in the "hook"
# variable. The snap will then call the hook's "execute" method.
hook = TransformScript(input, output, error, log)
# Script end
I am on Windows, so I copied the aws-java-sdk-1.9.6.jar file to c:/opt/snaplogic/userlibs on all of the plex nodes. You need to copy third-party jars to every plex node and save them at the same path on each one.
Also, on my nodes I created a credentials file at C:\Users\Bkukadia\.aws\credentials that contains these key=value pairs:
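Because a jar that is missing on one node only shows up later as an import error, it can help to check the paths before appending them. Here is a minimal sketch of that idea; `add_jars` is a hypothetical helper name, not part of SnapLogic or Jython, and the paths in the usage note are only examples.

```python
import os
import sys


def add_jars(jar_paths, search_path=None):
    """Append each existing jar to the search path and return the missing ones."""
    if search_path is None:
        search_path = sys.path
    missing = []
    for jar in jar_paths:
        if os.path.isfile(jar):
            # Avoid adding the same jar twice if the script is re-run.
            if jar not in search_path:
                search_path.append(jar)
        else:
            missing.append(jar)
    return missing
```

In a script you could call `add_jars(['/opt/snaplogic/userlibs/aws-java-sdk-1.9.6.jar'])` before the `from com.amazonaws ...` imports and log whatever comes back, so a node with a missing jar is easy to spot.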
aws_access_key_id=AWSKEYAKIAIGFUBXI
aws_secret_access_key=AWSSECRET8++Kg6QNMX6
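If you want to confirm that a node's credentials file is readable before the pipeline runs, something like the sketch below might help. It assumes the file uses the usual INI layout with a `[default]` profile header, which the excerpt above does not show; `read_aws_credentials` is a hypothetical helper name.

```python
try:
    from configparser import RawConfigParser  # Python 3
except ImportError:
    from ConfigParser import RawConfigParser  # Python 2 / Jython


def read_aws_credentials(path, profile="default"):
    """Return (access_key_id, secret_access_key) for one profile in the file."""
    parser = RawConfigParser()
    parser.read(path)
    # get() raises an error if the profile or either key is missing.
    return (parser.get(profile, "aws_access_key_id"),
            parser.get(profile, "aws_secret_access_key"))
```

The script above does not need this to work, since `AmazonS3Client()` is created without explicit credentials; the helper is only a way to fail early with a clear message.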
I think you can also use IAM roles, but I am not very familiar with them.
For more details on aws java sdk check this out - AWS SDK for Java
12-12-2017 08:21 PM
This is an excellent post, because we are running into a similar problem integrating with different AWS services, such as publishing to an SNS topic. We were trying to install Boto3 on the JCC node in the hope that the Python script would pick it up. Apparently, from this post, installing Python Boto3 won't work, and we need to download the AWS Java SDK instead and invoke it from the Python script. This is a good solution for us, but we need to know the Java import libraries for SNS.
Can someone help us by posting a similar code snippet, but for publishing to SNS instead of S3, using the Java SDK from Python? What are the import libraries we need?
12-12-2017 08:26 PM
Following up on my previous question, I think I found the answer at this link:
http://docs.aws.amazon.com/sns/latest/dg/using-awssdkjava.html
But feel free to add to it or provide a Python code snippet for SNS if I missed anything…
12-12-2017 09:15 PM
# Import the interface required by the Script snap.
from com.snaplogic.scripting.language import ScriptHook
import java.util
import urlparse
import urllib2
import sys
import socket
import json

# referencing aws java lib
sys.path.append('/opt/snaplogic/userlibs/aws-java-sdk-1.11.156.jar')
sys.path.append('/opt/snaplogic/userlibs/commons-logging-1.1.3.jar')
sys.path.append('/opt/snaplogic/userlibs/jackson-databind-2.6.6.jar')
sys.path.append('/opt/snaplogic/userlibs/jackson-core-2.6.6.jar')
sys.path.append('/opt/snaplogic/userlibs/jackson-annotations-2.6.0.jar')
sys.path.append('/opt/snaplogic/userlibs/httpcore-4.4.4.jar')
sys.path.append('/opt/snaplogic/userlibs/httpclient-4.5.2.jar')
sys.path.append('/opt/snaplogic/userlibs/joda-time-2.8.1.jar')

# importing core/aws java lib
from java.io import ByteArrayInputStream
from java.lang import Exception
from com.amazonaws.auth import BasicAWSCredentials
from com.amazonaws.services.s3 import AmazonS3Client
from com.amazonaws.services.s3.model import PutObjectRequest
from com.amazonaws.services.s3.model import ObjectMetadata

class TransformScript(ScriptHook):
    def __init__(self, input, output, error, log):
        self.input = input
        self.output = output
        self.error = error
        self.log = log

    # The "execute()" method is called once when the pipeline is started
    # and allowed to process its inputs or just send data to its outputs.
    def execute(self):
        self.log.info("Executing Transform script")
        while self.input.hasNext():
            try:
                # Read the next document, wrap it in a map and write out the wrapper
                in_doc = self.input.next()
                try:
                    # set "socket" time as 5 sec globally for s3 to read
                    socket.setdefaulttimeout(5)
                    # get object values from incoming json document
                    bucket_name = $_bucket
                    photo_id = in_doc.get("photo_id")
                    listing_id = in_doc.get("listing_id")
                    photo_url = in_doc.get("photo_url")
                    # "parse" url for photo_name and set location for file to be saved on s3
                    parsed_url = urlparse.urlsplit(photo_url)
                    path = parsed_url[2].split("/")
                    photo_name = path[len(path) - 1]
                    s3_location = $_subdir + $_pathdelimiter + listing_id + $_pathdelimiter + photo_name
                    # creating s3 client
                    # access photo content using "photo_url"
                    url_response = urllib2.urlopen(photo_url, timeout=5)
                    content_length = long(url_response.headers['Content-Length'])
                    content_type = url_response.headers['Content-Type']
                    obj_metadata = ObjectMetadata()
                    obj_metadata.setContentLength(content_length)
                    obj_metadata.setContentType(content_type)
                    data = url_response.read()
                    bais = ByteArrayInputStream(data)
                    bais.reset()
                    try:
                        basic_cred = BasicAWSCredentials("access_token", "secret_token")
                        s3 = AmazonS3Client(basic_cred)
                        s3.putObject(PutObjectRequest(bucket_name, s3_location, bais, obj_metadata))
                        bais.close()
                        wrapper = {
                            "photo_id" : photo_id,
                            "listing_id" : listing_id,
                            "photo_url" : photo_url,
                            "photo_name" : photo_name,
                            "photo_downloaded" : True,
                            "error" : None
                        }
                    except Exception as s3_error:
                        wrapper = {
                            "photo_id" : photo_id,
                            "listing_id" : listing_id,
                            "photo_url" : photo_url,
                            "photo_name" : photo_name,
                            "photo_downloaded" : False,
                            "error" : "s3_error, " + str(s3_error)
                        }
                except BaseException as photo_url_error:
                    wrapper = {
                        "photo_id" : photo_id,
                        "listing_id" : listing_id,
                        "photo_url" : photo_url,
                        "photo_name" : photo_name,
                        "photo_downloaded" : False,
                        "error" : "photo_url_error, " + str(photo_url_error)
                    }
                # output
                self.output.write(in_doc, wrapper)
            # error view output when error view is enabled
            except Exception as e:
                errWrapper = {
                    'errMsg' : str(e.args)
                }
                self.log.error("Error in python script")
                self.error.write(errWrapper)
        self.log.info("Finished executing the Transform script")

# The Script Snap will look for a ScriptHook object in the "hook"
# variable. The snap will then call the hook's "execute" method.
hook = TransformScript(input, output, error, log)
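The step that turns a photo URL into an S3 location can be pulled out and tried on its own, outside the snap. This is a rough sketch of that step only; `build_s3_key` is a hypothetical name, and the pipeline parameters `$_subdir` and `$_pathdelimiter` are passed in as ordinary arguments here. Unlike the script, it converts `listing_id` with `str()` so a numeric id does not break the concatenation.

```python
try:
    from urlparse import urlsplit  # Python 2 / Jython
except ImportError:
    from urllib.parse import urlsplit  # Python 3


def build_s3_key(photo_url, subdir, listing_id, delimiter="/"):
    """Build '<subdir>/<listing_id>/<file name>' from the last segment of the URL path."""
    # urlsplit drops the query string, so only the path is used for the file name.
    photo_name = urlsplit(photo_url).path.split("/")[-1]
    return subdir + delimiter + str(listing_id) + delimiter + photo_name
```

For example, `build_s3_key("http://example.com/img/2017/photo.jpg?x=1", "photos", 42)` gives `photos/42/photo.jpg`.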
This is my very first SnapLogic Jython script, and it works well.
A couple of things to keep in mind:
You can write the code in Java first and convert it to Jython if it gets too complicated.
Link to the Jython docs: http://www.jython.org/docs/library/indexprogress.html
At one point I tried to add more complexity to this code and realized that Jython could not handle what I intended to do.
So my new approach was to use a Unix snap and run my actual Python script. Yes, an actual Python script.
Hopefully some day SnapLogic will add native Java and Python support.
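For the external-script approach, the standalone script could look roughly like the sketch below. It assumes documents reach the script as one JSON object per line, which is an assumption and not something the post states; `transform` and `process_lines` are hypothetical names. Since this runs under regular Python, `transform` is where imports such as boto3 or requests would be usable.

```python
import json


def transform(doc):
    """Hypothetical per-document logic; replace with the real work."""
    result = dict(doc)
    result["processed"] = True
    return result


def process_lines(lines):
    """Parse one JSON document per line and yield transformed JSON strings."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        yield json.dumps(transform(json.loads(line)), sort_keys=True)
```

To wire it to standard input and output, the script would end with a loop such as `for out_line in process_lines(sys.stdin): print(out_line)` after an `import sys`.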