
SnapLogic deployment on Kubernetes - A reference guide


Overview

SnapLogic supports the deployment of Groundplexes on Kubernetes platforms, enabling the application to leverage the various capabilities of Kubernetes. This document covers best practice recommendations for deploying SnapLogic on Kubernetes, along with a sample deployment example using GKE.

The examples in this document are specific to the GKE platform; however, the concepts can be applied to other managed Kubernetes platforms such as Amazon EKS and Azure AKS.

Author:
Ram Bysani
SnapLogic Enterprise Architecture team

Helm Chart

A Helm chart is used to define the various deployment configurations for an application on Kubernetes. Additional information about Helm charts can be found here. The Helm chart package for a SnapLogic deployment can be downloaded from the Downloads section. It contains the following files:
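As a quick check, the chart metadata and default values can be inspected with standard Helm commands after extracting the downloaded package (the archive name below is illustrative):

unzip snaplogic_helm_chart.zip -d snaplogic-snaplex
cd snaplogic-snaplex
helm show chart .     # prints the Chart.yaml metadata
helm show values .    # prints the default values.yaml
helm lint .           # validates the chart structure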

Artifact

Comments

values.yaml

This file defines the default configuration for the SnapLogic Snaplex deployment. It includes variables like the number of JCC nodes, container image details, resource limits, and settings for Horizontal Pod Autoscaling (HPA). Reference: values.yaml

Chart.yaml

This file defines the metadata and version information for the Helm chart.

templates folder

This directory contains the Kubernetes manifest templates which define the resources to be deployed into the cluster. These templates are YAML files that specify Kubernetes resources with templating capabilities that allow for parameterization, flexibility, and reuse.

templates/deployment.yaml

This file defines a Kubernetes Deployment resource for managing the deployment of JCC instances in a cluster. The deployment is created only if the value of jccCount is greater than 0, as specified in the Helm chart's values.yaml file.

templates/deployment-feed.yaml

This file defines a Kubernetes Deployment resource for managing the deployment of Feedmaster instances. The deployment is created only if the feedmasterCount value in the Helm chart's values.yaml file is greater than 0.

templates/hpa.yaml

The hpa.yaml file defines a Horizontal Pod Autoscaler (HPA) resource for a Kubernetes application. The HPA automatically scales the number of pod replicas in a deployment or replica set based on observed metrics such as CPU utilization or custom metrics.

templates/service.yaml

The service.yaml file describes a Kubernetes service that exposes the JCC component of your Snaplex.  
It creates a LoadBalancer type service, which allows external access to the JCC components through a public IP address. The service targets only pods labeled as 'jcc' within the specified Snaplex and Helm release, ensuring proper communication and management. 

templates/service-feed.yaml

The service-feed.yaml file describes a Kubernetes service that exposes the Feedmaster components. 
The service is created only if the value of feedmasterCount in the Helm chart's values.yaml file is greater than 0. It creates a LoadBalancer type service, which allows external access to the Feedmaster components through a public IP address.

templates/service-headless.yaml

The service-headless.yaml file describes a Kubernetes service for IPv6 communication. The service is created only if the value of enableIPv6 in the Helm chart's values.yaml file is set to true.

Table 1.0 Helm Chart configurations

Desired State vs Current State

The configurations in the various yaml files (e.g. Deployment, HPA, values, etc.) represent the "Desired" state of a Kubernetes deployment. The Kubernetes controllers constantly monitor the Current state of the deployment and reconcile it to bring it into alignment with the Desired state.
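For example, the desired replica count lives in the Deployment spec, while the controller reports the current count in the status; the two can be compared with kubectl (the deployment name below is the one created later in this example):

kubectl get deployment snaplogic-snaplex-jcc \
  -o jsonpath='desired={.spec.replicas} current={.status.readyReplicas}{"\n"}'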

Horizontal Pod Autoscaling (HPA)

Horizontal Pod Autoscaling (HPA) is a feature in Kubernetes that automatically adjusts the number of replicas (pods) for your deployments based on resource metrics like CPU utilization and memory usage. SnapLogic supports HPA for deployments in a Kubernetes environment. The add-on Metrics Server must be installed (Reference: Metrics-Server). Metrics collection is enabled by default in GKE as part of Cloud Monitoring.

Note that Custom Metrics, External Metrics, and Vertical Pod Autoscaling (VPA) are not supported for SnapLogic deployments on Kubernetes.
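The HPA rendered from templates/hpa.yaml is a standard autoscaling/v2 HorizontalPodAutoscaler. A minimal sketch of the resulting object, assuming the autoscaling values used later in Table 2.0 (resource names follow the release created in this document), would look like this:

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: snaplogic-snaplex-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: snaplogic-snaplex-jcc      # the JCC deployment is the scale target
  minReplicas: 1
  maxReplicas: 3
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 60     # targetAvgCPUUtilization
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 60     # targetAvgMemoryUtilization
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 600   # scaleDownStabilizationWindowSeconds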

Groundplex deployment in a GKE environment - Example

In this section, we will go over the various steps for a SnapLogic Groundplex deployment in a GKE environment.

Groundplex creation 

Create a new Groundplex from the Admin Manager interface (Reference: Snaplex_creation). The nodes for this Snaplex will be updated when the application is deployed to the GKE environment.

[Image: New Snaplex creation]

GKE Cluster creation

Next, we create the GKE cluster on the Google Cloud console. We created our cluster in Autopilot mode; in this mode, GKE manages the cluster and node configurations, including scaling, load balancing, monitoring, metrics, and workload optimization. Reference: GKE Cluster

[Image: GKE cluster]

Configure the SnapLogic platform Allowlist

Add the SnapLogic platform IP addresses to the Allowlist. See Platform Allowlist. In GKE, this is usually done by configuring an Egress Firewall rule on the GKE cluster. Please refer to the GKE documentation for additional details.

[Image: Firewall rule - Egress]
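A minimal sketch of such an egress rule using gcloud is shown below; the rule name and network are placeholders, and the destination ranges must be replaced with the SnapLogic platform IP addresses from the Allowlist documentation:

gcloud compute firewall-rules create allow-snaplogic-egress \
  --network=<cluster-vpc-network> \
  --direction=EGRESS \
  --action=ALLOW \
  --rules=tcp:443 \
  --destination-ranges=<snaplogic-platform-ip-range-1>,<snaplogic-platform-ip-range-2>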

Helm configurations

values.yaml

The table below explains the configurations for some of the sections of the values.yaml file used in our setup. The modified files are attached to this article for reference. Reference: Helm chart configuration

Section

Comments

# Regular nodes count

jccCount: 3

# Feedmaster nodes count

feedmasterCount: 0

This defines the number of JCC pods.
We have enabled HPA for our test scenario, so the JCC pod count is taken from the HPA section (i.e. minReplicas and maxReplicas). The pod count is the number of pods across all nodes of the cluster.

No Feedmaster pods are configured in this example. Feedmaster count can be half of the JCC pod count. Feedmaster is used to distribute Ultra task requests to the JCC pods. 
HPA configuration is only applicable to the JCC pods and not to the Feedmaster pods.

# Docker image of SnapLogic snaplex

image:

  repository: snaplogic/snaplex
  tag: latest

The latest tag refers to the most recent release version of the repository image. You can specify a different tag if you need to pin the Snaplex to a previous release for testing, etc.

# SnapLogic configuration link

snaplogic_config_link:

https://uat.elastic.snaplogic.com/api/1/rest/plex/config/org/proj_space/shared/project

Retrieve the configuration link for the Snaplex by executing the Public API.
The config link string is the portion before ?expires in the output value of the API.
Example:
snaplogic_config_link: 

https://uat.elastic.snaplogic.com/api/1/rest/plex/config/QA/RB_Temp_Space/shared/RBGKE_node1

# SnapLogic Org admin credential

snaplogic_secret: secret/mysecret

Create the SnapLogic secret manifest and apply it with kubectl:

kubectl apply -f snapSecret.yaml

Please see the section To create the SnapLogic secret in this document: Org configurations. A minimal manifest sketch is also included after Table 2.0 below.

# CPU and memory limits/requests for the nodes
limits:

  memory: 8Gi
  cpu: 2000m

requests:
  memory: 8Gi
  cpu: 2000m

Set requests and limits to the same values to ensure resource availability for the container processes.

Avoid running other processes in the same container as the JCC so that the JCC can have the maximum amount of memory.

# Default file ulimit and process ulimit

sl_file_ulimit: 8192

sl_process_ulimit: 4096

The value should be more than the # of slots configured for the node. (Maximum Slots under Node properties of the Snaplex). 

If not set, then the node defaults will be used. (/etc/security/limits.conf). The JCC process is initialized with these values.

# JCC HPA

autoscaling:

  enabled: true
  minReplicas: 1
  maxReplicas: 3

minReplicas defines the minimum number of Pods that must be running.

maxReplicas defines the maximum number of Pods that can be scheduled on the node(s).
The general guideline is to start with 1:2 or 1:3 Pods per node. The replica Pods are across all nodes of a deployment and not per node.

targetAvgCPUUtilization: 60

targetAvgMemoryUtilization: 60

To enable these metrics, the Kubernetes Metrics Server installation is required. Metrics collection is enabled by default in GKE as part of Cloud Monitoring.

targetAvgCPUUtilization:

Average CPU utilization percentage (i.e. 60 = 60%)

This is the average CPU utilization across all Pods. HPA will scale up or scale down Pods to maintain this average.

targetAvgMemoryUtilization:

Average memory utilization percentage.
This parameter specifies the average memory utilization (as a percentage of the requested memory) that the HPA should maintain across all the replicas of a particular deployment or stateful set.

scaleDownStabilizationWindowSeconds: 600

terminationGracePeriodSeconds: 900

# Enable IPv6 service for DNS routing to pods

enableIPv6: false

scaleDownStabilizationWindowSeconds is a parameter used by the Kubernetes Horizontal Pod Autoscaler (HPA). It controls the amount of time the HPA waits (a cool-down period) before scaling down the number of pods after a decrease in resource utilization.

terminationGracePeriodSeconds defines the amount of time Kubernetes gives a pod to terminate gracefully. If the containers have not exited after terminationGracePeriodSeconds, Kubernetes sends a SIGKILL signal to forcibly terminate the containers and removes the pod from the cluster.

Table 2.0 - values.yaml
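For reference, here is a minimal sketch of the snapSecret.yaml manifest referenced in the snaplogic_secret section above. The secret and key names are illustrative; follow the To create the SnapLogic secret section of the Org configurations documentation for the exact format expected by the Snaplex:

apiVersion: v1
kind: Secret
metadata:
  name: mysecret                  # must match the snaplogic_secret value in values.yaml
type: Opaque
stringData:
  username: <org-admin-username>  # illustrative key name
  password: <org-admin-password>  # illustrative key name

Apply it with: kubectl apply -f snapSecret.yaml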

Load balancer configuration

The service.yaml file contains a section for the Load balancer configuration. Autopilot mode in GKE supports the creation of a Load balancer service.

Section

Comments

type: LoadBalancer

  ports:
   - port: 8081
     protocol: TCP
     name: jcc

  selector:

A Load balancer service will be created by GKE to route traffic to the application’s pods. 

The external IP address and port details must be configured on the Settings tab of the Snaplex. An example is included in the next section of this document.

Table 3.0 service.yaml

Deployment using Helm

Upload the Helm package zip file to the Cloud Shell instance by selecting the Upload option. The default Helm package for SnapLogic can be downloaded from here. It is recommended to download the latest package from the SnapLogic documentation link.
The values.yaml file with additional custom configurations (as described in Tables 2.0 / 3.0 above) is attached to this article.


  • Execute the following command in the terminal to install and deploy the Snaplex release with a unique name, such as snaplogic-snaplex, using the configurations from the values.yaml file. The release name is a unique identifier and can be different for multiple deployments such as Dev / Prod, etc.

helm install snaplogic-snaplex . -f values.yaml

<<Output>>
NAME: snaplogic-snaplex
NAMESPACE: default
STATUS: deployed
REVISION: 5
TEST SUITE: None
NOTES:

  • You can run this command to update an existing deployment with any new or updated Helm configurations.

helm upgrade snaplogic-snaplex . -f values.yaml
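Helm tracks each install or upgrade of the release as a numbered revision (shown as REVISION in the install output above). The current state and revision history of the release can be checked with:

helm list
helm status snaplogic-snaplex
helm history snaplogic-snaplex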

  • View the deployed application under the Workloads tab on the Google Cloud Console.

[Image: Workloads]

  • This command returns the HPA details.

$ kubectl describe hpa

Name:             snaplogic-snaplex-hpa
Namespace:        default
Labels:           app.kubernetes.io/instance=snaplogic-snaplex
                  app.kubernetes.io/managed-by=Helm
                  app.kubernetes.io/name=snaplogic-snaplex
                  app.kubernetes.io/version=1.0
                  helm.sh/chart=snaplogic-snaplex-0.2.0
Annotations:      meta.helm.sh/release-name: snaplogic-snaplex
                  meta.helm.sh/release-namespace: default
Reference:        Deployment/snaplogic-snaplex-jcc
Metrics:          ( current / target )
  resource cpu on pods (as a percentage of request):     8% (153m) / 60%
  resource memory on pods (as a percentage of request):  28% (1243540138666m) / 60%
Min replicas:     1
Max replicas:     3

  • Run the kubectl command to list the services. You can see the external IP addresses for the Load balancer service.

kubectl get services

NAME                        TYPE           CLUSTER-IP       EXTERNAL-IP     PORT(S)          AGE

kubernetes                  ClusterIP      34.118.224.1     <none>          443/TCP          16d
snaplogic-snaplex-regular   LoadBalancer   34.118.227.164   34.45.230.213   8081:32526/TCP   25m

Update the Load balancer URL on the Snaplex

Note the external IP address of the LoadBalancer service, and update the host and port in the Load balancer field on the Settings tab of the Snaplex. Example: http://1.3.4.5:8081

[Image: Load balancer]
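The external IP address can also be retrieved directly from the service (the service name matches the kubectl get services output above):

kubectl get service snaplogic-snaplex-regular -o jsonpath='{.status.loadBalancer.ingress[0].ip}'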

Listing pods in GKE

  • The following commands can be executed to view the pod statuses. Pod creation and maintenance are fully managed by GKE.

$ kubectl top pods
$ kubectl get pods

kubectl get pods --field-selector=status.phase=Running

NAME                                    READY   STATUS    RESTARTS   AGE

snaplogic-snaplex-jcc-687d87994-crzw9   0/1     Running   0          2m
snaplogic-snaplex-jcc-687d87994-kks7l   1/1     Running   0          2m38s
snaplogic-snaplex-jcc-687d87994-pcfvp   1/1     Running   0          2m24s  

View node details in the SnapLogic Monitor application

Each pod represents a JCC node. The maxReplicas value is set to 3, so you would see a maximum of 3 nodes (pods) deployed (Analyze -> Infrastructure tab).

[Image: Snaplex nodes]

  • The command below uninstalls and deletes the deployment from the cluster. All deployed services, metadata, and associated resources are also removed.

helm uninstall <deployment_name>

Pod registration with the SnapLogic Control Plane

Scenario

Comments

How are the Pod neighbors resolved and maintained by the SnapLogic Control Plane?

When a JCC/FeedMaster node (Pod) starts, it registers with the SnapLogic Control Plane, and the Control Plane maintains the list of Pod neighbors.

When a JCC/FeedMaster node (Pod) registers, it also publishes its IP address to the Control Plane. An internal list of Pod IP addresses is updated dynamically for neighbor-to-neighbor communication. DNS resolution is not used.

How are the container repository versions updated?

The latest Snaplex release build is published to the Docker repository under the tag 'latest'. The pods will be deployed with this version on startup by referencing the tag from the values.yaml file.
If the Snaplex version is updated on the Control Plane to a different version (e.g. main-2872), then the JCC nodes (pods) will be updated to match that version (i.e. main-2872). An example of pinning a specific tag is shown below.
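For example, to pin the pods to a specific release instead of the latest tag, set the image tag in values.yaml and re-run the Helm upgrade (the tag below is the example version mentioned above):

image:
  repository: snaplogic/snaplex
  tag: main-2872

helm upgrade snaplogic-snaplex . -f values.yaml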

Reference

Groundplex Deployment on Kubernetes
https://kubernetes.io/
GKE
HPA
