# SnapLogic deployment on Kubernetes - A reference guide
## Overview

SnapLogic supports the deployment of Groundplexes on Kubernetes platforms, enabling the application to leverage the various capabilities of Kubernetes. This document explains a few best-practice recommendations for deploying SnapLogic on Kubernetes, along with a sample deployment using GKE. The examples in this document are specific to the GKE platform; however, the concepts can be applied to other Kubernetes platforms such as AWS and Azure.

Author: Ram Bysani, SnapLogic Enterprise Architecture team

## Helm Chart

A Helm chart is used to define the various deployment configurations for an application on Kubernetes. Additional information about Helm charts can be found here. The Helm chart package for a SnapLogic deployment can be downloaded from the Downloads section. It contains the following files:

| Artifact | Comments |
| --- | --- |
| values.yaml | Defines the default configuration for the SnapLogic Snaplex deployment. It includes variables such as the number of JCC nodes, container image details, resource limits, and settings for Horizontal Pod Autoscaling (HPA). Reference: values.yaml |
| Chart.yaml | Defines the metadata and version information for the Helm chart. |
| templates folder | Contains the Kubernetes manifest templates which define the resources to be deployed into the cluster. These templates are YAML files that specify Kubernetes resources, with templating capabilities that allow for parameterization, flexibility, and reuse (an illustrative sketch follows after Table 1.0). |
| templates/deployment.yaml | Defines a Kubernetes Deployment resource for managing the deployment of JCC instances in a cluster. The Deployment is created only if the value of jccCount in the Helm chart's values.yaml file is greater than 0. |
| templates/deployment-feed.yaml | Defines a Kubernetes Deployment resource for managing the deployment of Feedmaster instances. The Deployment is created only if the value of feedmasterCount in the Helm chart's values.yaml file is greater than 0. |
| templates/hpa.yaml | Defines a Horizontal Pod Autoscaler (HPA) resource. The HPA automatically scales the number of pod replicas in a deployment or replica set based on observed metrics such as CPU utilization or custom metrics. |
| templates/service.yaml | Describes a Kubernetes service that exposes the JCC component of your Snaplex. It creates a LoadBalancer-type service, which allows external access to the JCC components through a public IP address. The service targets only pods labeled 'jcc' within the specified Snaplex and Helm release, ensuring proper communication and management. |
| templates/service-feed.yaml | Describes a Kubernetes service that exposes the Feedmaster components. The service is created only if the value of feedmasterCount in the Helm chart's values.yaml file is greater than 0. It creates a LoadBalancer-type service, which allows external access to the Feedmaster components through a public IP address. |
| templates/service-headless.yaml | Describes a Kubernetes service for IPv6 communication. The service is created only if the value of enableIPv6 in the Helm chart's values.yaml file is set to true. |

Table 1.0 - Helm Chart configurations
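To make the relationship between values.yaml and the templates concrete, below is a minimal, hypothetical sketch of how a template in the style of templates/deployment.yaml might consume values such as jccCount and the image settings. It is not the actual chart contents; the field names under .Values follow the settings described in Table 1.0, and everything else is illustrative.

```yaml
# Hypothetical sketch of a templates/deployment.yaml-style template.
# The Deployment renders only when jccCount > 0, mirroring the chart
# behavior described in Table 1.0 above.
{{- if gt (int .Values.jccCount) 0 }}
apiVersion: apps/v1
kind: Deployment
metadata:
  name: {{ .Release.Name }}-jcc
spec:
  replicas: {{ .Values.jccCount }}   # desired number of JCC pods
  selector:
    matchLabels:
      app: {{ .Release.Name }}-jcc
  template:
    metadata:
      labels:
        app: {{ .Release.Name }}-jcc   # matches the 'jcc' pod selector used by service.yaml
    spec:
      containers:
        - name: jcc
          image: "{{ .Values.image.repository }}:{{ .Values.image.tag }}"
{{- end }}
```

At install time, Helm substitutes the .Values references with the entries from values.yaml (or any overrides passed on the command line), producing the Desired-state manifests discussed in the next section.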
## Desired State vs Current State

The configurations in the various yaml files (e.g. Deployment, HPA, values, etc.) represent the "Desired" state of a Kubernetes deployment. The Kubernetes controllers constantly monitor the Current state of the deployment and work to bring it into alignment with the Desired state.

## Horizontal Pod Autoscaling (HPA)

Horizontal Pod Autoscaling (HPA) is a feature in Kubernetes that automatically adjusts the number of replicas (pods) for your deployments based on resource metrics such as CPU utilization and memory usage. SnapLogic supports HPA for deployments in a Kubernetes environment. The add-on Metrics Server must be installed. Reference: Metrics-Server. Metrics collection is enabled by default in GKE as part of Cloud Monitoring.

Note that Custom Metrics, External Metrics, and Vertical Pod Autoscaling (VPA) are not supported for SnapLogic deployments on Kubernetes.

## Groundplex deployment in a GKE environment - Example

In this section, we go over the steps for a SnapLogic Groundplex deployment in a GKE environment.

### Groundplex creation

Create a new Groundplex from the Admin Manager interface. Reference: Snaplex_creation. The nodes for this Snaplex will be updated when the application is deployed to the GKE environment.

*Figure: New Snaplex creation*

### GKE Cluster creation

Next, we create the GKE cluster on the Google Cloud console. We have created our cluster in Autopilot mode. In this mode, GKE manages the cluster and node configurations, including scaling, load balancing, monitoring, metrics, and workload optimization. Reference: GKE Cluster

*Figure: GKE cluster*

### Configure the SnapLogic platform Allowlist

Add the SnapLogic platform IP addresses to the Allowlist. See Platform Allowlist. In GKE, this is usually done by configuring an Egress firewall rule on the GKE cluster. Please refer to the GKE documentation for additional details.

*Figure: Firewall rule - Egress*

## Helm configurations

### values.yaml

The sections below explain the configurations for some of the settings from the values.yaml file used in our setup. The modified files are attached to this article for reference. Reference: Helm chart configuration

```yaml
# Regular nodes count
jccCount: 3
# Feedmaster nodes count
feedmasterCount: 0
```

This defines the number of JCC pods. We have enabled HPA for our test scenario, so the jccCount will be picked from the HPA section (i.e. minReplicas and maxReplicas). The pod count is the number of pods across all nodes of the cluster. No Feedmaster pods are configured in this example. The Feedmaster count can be half of the JCC pod count. The Feedmaster is used to distribute Ultra task requests to the JCC pods. HPA configuration applies only to the JCC pods, not to the Feedmaster pods.

```yaml
# Docker image of SnapLogic snaplex
image:
  repository: snaplogic/snaplex
  tag: latest
```

This specifies the most recent release version of the repository image. You can specify a different tag if you need to pin the version to a previous release for testing, etc.

```yaml
# SnapLogic configuration link
snaplogic_config_link: https://uat.elastic.snaplogic.com/api/1/rest/plex/config/org/proj_space/shared/project
```

Retrieve the configuration link for the Snaplex by executing the Public API. The config link string is the portion before ?expires in the output value of the API. Example:

snaplogic_config_link: https://uat.elastic.snaplogic.com/api/1/rest/plex/config/QA/RB_Temp_Space/shared/RBGKE_node1

```yaml
# SnapLogic Org admin credential
snaplogic_secret: secret/mysecret
```

Create the secret, then execute the kubectl command: kubectl apply -f snapSecret.yaml. Please see the section "To create the SnapLogic secret" in this document: Org configurations. A sketch of such a secret manifest follows below.
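For orientation, a Secret manifest of the kind applied above could look like the following sketch. The secret name (mysecret) matches the snaplogic_secret reference, but the key names and values shown here are assumptions for illustration only; follow the "To create the SnapLogic secret" steps in the Org configurations document for the authoritative format.

```yaml
# snapSecret.yaml -- illustrative sketch only; see the SnapLogic
# "Org configurations" documentation for the exact required format.
apiVersion: v1
kind: Secret
metadata:
  name: mysecret
type: Opaque
stringData:
  username: org-admin@example.com   # hypothetical Org admin username
  password: changeit                # hypothetical Org admin password
```

Apply it with kubectl apply -f snapSecret.yaml. Using stringData (rather than data) lets you supply the values without base64-encoding them yourself.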
```yaml
# CPU and memory limits/requests for the nodes
limits:
  memory: 8Gi
  cpu: 2000m
requests:
  memory: 8Gi
  cpu: 2000m
```

Set requests and limits to the same values to ensure resource availability for the container processes. Avoid running other processes in the same container as the JCC so that the JCC can have the maximum amount of memory.

```yaml
# Default file ulimit and process ulimit
sl_file_ulimit: 8192
sl_process_ulimit: 4096
```

The value should be more than the number of slots configured for the node (Maximum Slots under the Node properties of the Snaplex). If not set, the node defaults (/etc/security/limits.conf) are used. The JCC process is initialized with these values.

```yaml
# JCC HPA
autoscaling:
  enabled: true
  minReplicas: 1
  maxReplicas: 3
```

minReplicas defines the minimum number of Pods that must be running. maxReplicas defines the maximum number of Pods that can be scheduled on the node(s). The general guideline is to start with 1:2 or 1:3 Pods per node. The replica Pods are across all nodes of a deployment, not per node.

```yaml
  targetAvgCPUUtilization: 60
  targetAvgMemoryUtilization: 60
```

To enable these metrics, the Kubernetes Metrics Server must be installed. Metrics collection is enabled by default in GKE as part of Cloud Monitoring.

targetAvgCPUUtilization: Average CPU utilization percentage (i.e. 60 = 60%). This is the average CPU utilization across all Pods; the HPA will scale Pods up or down to maintain this average.

targetAvgMemoryUtilization: Average memory utilization percentage. This parameter specifies the average memory utilization (as a percentage of the requested memory) that the HPA should maintain across all replicas of a particular deployment or stateful set.

```yaml
  scaleDownStabilizationWindowSeconds: 600
terminationGracePeriodSeconds: 900
# Enable IPv6 service for DNS routing to pods
enableIPv6: false
```

scaleDownStabilizationWindowSeconds is a parameter used in the Kubernetes Horizontal Pod Autoscaler (HPA). It controls the amount of time the HPA waits (like a cool-down period) before scaling down the number of pods after a decrease in resource utilization.

terminationGracePeriodSeconds defines the amount of time Kubernetes gives a pod to terminate before killing it. If the containers have not exited after terminationGracePeriodSeconds, Kubernetes sends a SIGKILL signal to forcibly terminate the containers and remove the pod from the cluster.

Table 2.0 - values.yaml

### Load balancer configuration

The service.yaml file contains a section for the Load balancer configuration. Autopilot mode in GKE supports the creation of a Load balancer service.

```yaml
type: LoadBalancer
ports:
  - port: 8081
    protocol: TCP
    name: jcc
selector:
```

A Load balancer service will be created by GKE to route traffic to the application's pods. The external IP address and port details must be configured on the Settings tab of the Snaplex. An example is included in the next section of this document.

Table 3.0 - service.yaml
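Pulling the settings from Table 2.0 together, a condensed values.yaml for this example could look like the sketch below. The exact key layout can differ between chart versions (in particular, whether the target utilization and stabilization settings nest under autoscaling), so verify it against the values.yaml shipped with your chart.

```yaml
# Condensed values.yaml sketch assembled from the settings in Table 2.0.
jccCount: 3               # superseded by the HPA section when autoscaling is enabled
feedmasterCount: 0        # no Feedmaster pods in this example

image:
  repository: snaplogic/snaplex
  tag: latest

snaplogic_config_link: https://uat.elastic.snaplogic.com/api/1/rest/plex/config/org/proj_space/shared/project
snaplogic_secret: secret/mysecret

limits:
  memory: 8Gi
  cpu: 2000m
requests:
  memory: 8Gi
  cpu: 2000m

sl_file_ulimit: 8192
sl_process_ulimit: 4096

autoscaling:
  enabled: true
  minReplicas: 1
  maxReplicas: 3
  targetAvgCPUUtilization: 60
  targetAvgMemoryUtilization: 60
  scaleDownStabilizationWindowSeconds: 600

terminationGracePeriodSeconds: 900
enableIPv6: false
```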
### Deployment using Helm

Upload the Helm zip file package to the Cloud Shell instance by selecting the Upload option. The default Helm package for SnapLogic can be downloaded from here. It is recommended to download the latest package from the SnapLogic documentation link. The values.yaml file with additional custom configurations (as described in Tables 2.0 / 3.0 above) is attached to this article.

Execute the command below on the terminal to install and deploy the Snaplex release, with a unique name such as snaplogic-snaplex, using the configurations from the values.yaml file. The release name is a unique identifier and can be different for multiple deployments, such as Dev / Prod, etc.

```
helm install snaplogic-snaplex . -f values.yaml

<<Output>>
NAME: snaplogic-snaplex
NAMESPACE: default
STATUS: deployed
REVISION: 5
TEST SUITE: None
NOTES:
```

You can run this command to update an existing deployment with any new or updated Helm configurations:

```
helm upgrade snaplogic-snaplex . -f values.yaml
```

View the deployed application under the Workloads tab on the Google Cloud Console.

*Figure: Workloads*

This command returns the HPA details:

```
$ kubectl describe hpa

Name:         snaplogic-snaplex-hpa
Namespace:    default
Labels:       app.kubernetes.io/instance=snaplogic-snaplex
              app.kubernetes.io/managed-by=Helm
              app.kubernetes.io/name=snaplogic-snaplex
              app.kubernetes.io/version=1.0
              helm.sh/chart=snaplogic-snaplex-0.2.0
Annotations:  meta.helm.sh/release-name: snaplogic-snaplex
              meta.helm.sh/release-namespace: default
Deployment/snaplogic-snaplex-jcc
Metrics:      ( current / target )
  resource cpu on pods (as a percentage of request):     8% (153m) / 60%
  resource memory on pods (as a percentage of request):  28% (1243540138666m) / 60%
Min replicas: 1
Max replicas: 3
```

Run the kubectl command to list the services. You can see the external IP address for the Load balancer service.

```
kubectl get services

NAME                        TYPE           CLUSTER-IP       EXTERNAL-IP     PORT(S)          AGE
kubernetes                  ClusterIP      34.118.224.1     <none>          443/TCP          16d
snaplogic-snaplex-regular   LoadBalancer   34.118.227.164   34.45.230.213   8081:32526/TCP   25m
```

### Update the Load balancer URL on the Snaplex

Note the external IP address of the LoadBalancer service, and update the host and port in the Load balancer field of the Snaplex. Example: http://1.3.4.5:8081

*Figure: Load balancer*

### Listing pods in GKE

The following commands can be executed to view the pod statuses. Pod creation and maintenance is fully managed by GKE.

```
$ kubectl top pods
$ kubectl get pods

kubectl get pods --field-selector=status.phase=Running
NAME                                    READY   STATUS    RESTARTS   AGE
snaplogic-snaplex-jcc-687d87994-crzw9   0/1     Running   0          2m
snaplogic-snaplex-jcc-687d87994-kks7l   1/1     Running   0          2m38s
snaplogic-snaplex-jcc-687d87994-pcfvp   1/1     Running   0          2m24s
```

### View node details in the SnapLogic Monitor application

Each pod represents a JCC node. The maxReplicas value is set to 3, so you would see a maximum of 3 nodes (pods) deployed (Analyze -> Infrastructure tab).

*Figure: Snaplex nodes*

The command below uninstalls and deletes the deployment from the cluster. All deployed services, metadata, and associated resources are also removed.

```
helm uninstall <deployment_name>
```
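Before uninstalling, or whenever you change the autoscaling settings, it can be useful to confirm that the rollout and the HPA are behaving as expected. The commands below are standard kubectl calls; the resource names assume the snaplogic-snaplex release name used in this example.

```shell
# Wait until the JCC Deployment has finished rolling out
kubectl rollout status deployment/snaplogic-snaplex-jcc

# Watch the HPA adjust the replica count as utilization changes (Ctrl+C to stop)
kubectl get hpa snaplogic-snaplex-hpa --watch

# Review scale-up/scale-down events recorded for the HPA
kubectl describe hpa snaplogic-snaplex-hpa
```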
## Pod registration with the SnapLogic Control Plane

| Scenario | Comments |
| --- | --- |
| How are the Pod neighbors resolved and maintained by the SnapLogic Control Plane? | When a JCC/Feedmaster node (Pod) starts, it registers with the SnapLogic Control Plane, and the Control Plane maintains the list of Pod neighbors. When a JCC/Feedmaster node (Pod) registers, it also publishes its IP address to the Control Plane. An internal list of Pod IP addresses is updated dynamically for neighbor-to-neighbor communication. DNS resolution is not used. |
| How are the container repository versions updated? | The latest Snaplex release build is published to the Docker repository under the version tagged 'latest'. The pods are deployed with this version on startup by referencing the tag from the values.yaml file. If the Snaplex version is updated on the Control Plane to a different version (e.g. main-2872), then the JCC nodes (pods) will be updated to match that version. |

## Reference

- Groundplex Deployment on Kubernetes
- https://kubernetes.io/
- GKE
- HPA