Logging

1 Topic

Platform Administration Reference guide v3
Introduction This document is a reference manual for common administrative and management tasks on the SnapLogic platform. It has been revised to include the new Admin Manager and Monitor functionality, which replace the Classic Manager and Dashboard interfaces respectively. This document is for SnapLogic Environment Administrators (Org Administrators) and users involved in supporting or managing the platform components. Author: Ram Bysani SnapLogic Enterprise Architecture team Environment Administrator (known as Org Admin in the Classic Manager) permissions There are two reserved groups in SnapLogic: admins: Users in this group have full access to all projects in the Org. members: Users in this group have access to projects that they create, or to which they are granted access. Users are automatically added to this group when you create them, and they must be a part of the members group to have any privileges within that Org. There are two user roles: Environment admins: Org users who can manage the Org. Environment admins are part of the admins group, and this role is named “Org Admin” in the classic Manager. Basic user: All non-admin users. Within an Org, basic users can create projects and work with assets in the Project spaces to which they have been granted permission. To gain Org administrator privileges, a Basic user can be added to the admins group. The below table lists the various tasks under the different categories that an Environment admin user can perform: Task Comments USER MANAGEMENT Create and delete users. Update user profiles. Create and delete groups. Add users to a group. Configure password expiration policies. Enable users’ access to applications (AutoSync, IIP) When a user is removed from an Org, the administrator that removes the user becomes the owner of that user's assets. Reference: User Management MANAGER Create and manage Project Spaces. Update permissions (R, W, X) on an individual Project space and projects. Delete a Project space. Restore Project spaces, projects, and assets from the Recycle bin. Permanently delete Project spaces, projects, and assets from the Recycle bin. Configure Git integration and integration with tools such as Azure Repos, GitLab, and GHES. View Account Statistics, and generate reports for accounts, projects, and pipelines within the project that use an account. Upgrade/downgrade Snap Pack versions. ALERTS and NOTIFICATIONS Set up alerts and notifications. Set up Slack channels and recipients for notifications. Reference: Alerts SNAPLEX and ORG Create Groundplexes. Manage Snaplex versions. Update Snaplex settings. Update or revert a Snaplex version. APIM Publish, unpublish, and deprecate APIs on the Developer portal. Configure the Developer portal. Approve API subscriptions and manage/approve user accounts. Reference: API Management AutoSync Configure AutoSync user permissions. Configure connections for data pipeline endpoints. Create user groups to share connection configuration. View information on all data pipelines in the Org. Reference: AutoSync Administration Table 1.0 Org Admin Tasks SnapLogic Monitoring Dashboards The enhanced Monitor interface can be launched from the Apps (Waffle) menu located on the top right corner of the page. The enhanced Monitor Interface enables you to observe integration executions, activities, events, and infrastructure health in your SnapLogic environment. The Monitor pages are categorized under three main groups: Analyze Observe Review Reference: Move_from_Dashboard_to_Monitor The following table lists some common administrative and monitoring tasks for which the Monitor interface can be used. Task Monitor App page Integration Catalog to fetch and display metadata for all integrations in the environment. Monitor -> Analyze -> Integration Catalog Reference: Integration Catalog View of the environment over a time period. Monitor -> Analyze -> Insights Reference: Insights View pipeline and task executions along with statistics, logs, and other details. Stop executions. Download execution details. Monitor -> Analyze -> Execution Reference: Execution Monitor and manage Snaplex services and nodes with graph views for a time period. Monitor -> Analyze -> Infrastructure Reference: Infrastructure View and download metrics for Snaplex nodes for a time period. Monitor -> Analyze -> Metrics Monitor -> Observe -> API Metrics Reference: Metrics, API-Metrics Review Alert history and Activity logs. Monitor -> Review Reference: Alert History, Activity Log Troubleshooting Snaplex / Node / Pipeline issues. Reference: Troubleshooting Table 2.0 Monitor App features Metrics for monitoring CPU Consumption CPU consumption can be high (and exceed 90% at times) when pipelines are executing. A high CPU consumption percentage when no pipelines are executing could indicate a high CPU usage by other processes on the Snaplex node. Review CPU Metrics under the Monitor -> Metrics, and Monitor -> Infrastructure tabs. Reference: CPU utilization metrics System load average (For Unix based systems) Load average is a measure of the number of processes that are either actively running on the CPU or waiting in line to be processed by the CPU. e.g. in a system with 4 virtual CPUs: A load average value of 4.0 means average full use of all CPUs without any idle time or queue. A load average value of >4.0 suggests that processes are waiting for CPU time. A load average value of <4.0 indicates underutilization. System load. Monitor -> Metrics tab. Heap Memory Heap memory is used by the SnapLogic application to dynamically allocate memory at runtime to perform memory intensive operations. The JVM can crash with an Out-of-Memory exception if the heap memory limit is reached. High heap memory usage can also impact other application functions such as pipeline execution, metrics collection, etc. The key heap metrics are listed in the table below: Metric Comments Heap Size Amount of heap memory reserved by the OS This value can grow or shrink depending on usage. Used heap Portion of heap memory in use by the application’s Java objects This value changes constantly with usage. Max heap size Upper heap memory limit This value is constant and does not change. It can be configured by setting the jcc.heap.max_size property in the global.properties file or as a node property. Heap memory. Monitor -> Metrics tab. Non-heap memory consumption The JVM reserves additional native memory that is not part of the heap memory. This memory area is called Metaspace, and is used to store class metadata. Metaspace can grow dynamically based on the application’s needs. Non-heap memory metrics are similar to heap memory metrics however there is no limit on the size of the non-heap memory. In a Snaplex, non-heap size tends to stay somewhat flat or grow slowly over longer periods of time. Non-heap size values larger than 1 GiB should be investigated with help from SnapLogic support. Note that all memory values are displayed in GiB (Gibibytes). Non-Heap memory. Monitor -> Analyze -> Metrics (Node) Swap memory Swap memory or swap space is a portion of disk used by the operating system to extend the virtual memory beyond the physical RAM. This allows multiple processes to share the computer’s memory by “swapping out” some of the RAM used by less active processes to the disk, making more RAM available for the more active processes. Swap space is entirely managed by the operating system, and not by individual processes such as the SnapLogic Snaplex. Note that swap space is not “extra” memory that can compensate for low heap memory. Refer to this document for information about auto, and custom heap settings. Reference: Custom heap setting. High swap utilization is an indicator of contention between processes, and may suggest a need for higher RAM. Additional Metrics Select the node from Monitor -> Analyze, and navigate to the Metrics tab. Review the following metrics. Active Pipelines Monitor the Average and Max active pipeline counts for specific time periods. Consider adding nodes for load balancing and platform stability if these counts are consistently high. Active Pipelines. Monitor -> Analyze -> Metrics (Node) Active Threads Active threads. Monitor -> Analyze -> Metrics (Node) Every Snap in an active pipeline consumes at least one thread. Some Snaps such as Pipeline Execute, Bulk loaders, and Snaps performing input/output can use a higher number of threads compared to other Snaps. Refer to this Sigma document on community.snaplogic.com: Snaplex Capacity Tuning Guide for additional configuration details. Disk Utilization It is important to monitor disk utilization as the lack of free disk space can lead to blocking threads, and can potentially impact essential Snaplex functions such as heartbeats to the Control Plane. Disk utilization. Monitor -> Analyze -> Metrics (Node) Additional Reference: Analyze Metrics. Download data in csv format for the individual Metrics graphs. Enabling Notifications for Snaplex node events Event Notifications can be created on the Manager (Currently in the Classic Manager) under Settings -> Notifications. The notification rule can be set up to send an alert about a tracked event to multiple email addresses. The alerts can also be viewed on the Manager under the Alerts tab. Reference: Notification Events Snaplex Node notifications Telemetry Integration with third-party observability tools using OpenTelemetry (OTEL) The SnapLogic platform uses OpenTelemetry (OTEL) to support telemetry data integration with third-party observability tools. Please contact your CSM to enable the Open Telemetry feature. Reference: Open Telemetry Integration Node diagnostics details The Node diagnostics table includes diagnostic data that can be useful for troubleshooting. For configurable settings, the table displays the Maximum, Minimum, Recommended, and Current values in GiB (Gibibytes) where applicable. The values in red indicate settings outside of the recommended range. Navigate to the Monitor -> infrastructure -> (Node) -> Additional Details tab. Example: Node diagnostics table Identifying pipelines that contribute to a node crash / termination Monitor Page Comments Monitor -> Activity logs Filter by category = Snaplex. Make note of the node crash events for a specific time period Event name text: Node crash event is reported Reference: Activity Logs Monitor -> Execution Select the execution window in the Calendar. Filter executions by setting these Filter conditions: Status: Failed Node name: <Enter node name from the crash event> Reference: Execution Sort on the Documents column to identify the pipeline executions processing the most number of documents. Click anywhere on the row to view the execution statistics. You can also view the active pipelines for that time period from the Monitor -> Metrics -> Active pipelines view. Table 3.0 Pipeline execution review Additional configurations to mitigate pipeline terminations The below thresholds can be optimized to minimize pipeline terminations due to Out-of-Memory exceptions. Note that the memory thresholds are based on the physical memory on the node, and not the Virtual / Swap memory. Maximum Memory % Pipeline termination threshold Pipeline restart delay interval Refer to the table Table 3.0 Snaplex node memory configurations in this Sigma document for additional details and recommended values: Snaplex Capacity Tuning Pipeline Quality Check API The Linter public API for pipeline quality provides additional rules to provide complete reports for all standard checks, including message levels (Critical / Warning / Info), with actionable message descriptions for pipeline quality. Reference: Pipeline Quality Check By applying the quality checks, it is possible to optimize pipelines, and improve maintainability. You can also use SnapGPT to analyze pipelines, identify issues, and suggest best practices to improve your pipelines. (SnapGPT_Analyze_Pipelines) Other third party profiling tools Third party profiling tools such as VisualVM can be used to monitor local memory, CPU, and other metrics. This document will be updated in a later version to include the VisualVM configurations for the SnapLogic application running on a Groundplex. Java Component Container (jcc) command line utility (for Groundplexes) The jcc script is a command-line tool that provides a set of commands to manage the Snaplex nodes. This utility is installed in the /opt/snaplogic/bin directory of the Groundplex node. The below table lists the commonly used arguments for the jcc script (jcc.sh on Linux and jcc.bat on Windows). Note that the command would list other arguments (for example, try-restart). However, those are mainly included for backward compatibility and not frequently used. $SNAPLOGIC refers to the /opt/snaplogic directory on Linux or the <Windows drive>:\opt\snaplogic directory on Windows servers. Run these commands as the root user on Linux and as an Administrator on Windows. Example: sudo /opt/snaplogic/bin/jcc.sh restart or c:\snaplogic\bin\jcc.bat restart Argument Description Comments status Returns the Snaplex status. The response string would indicate if the Snaplex Java process is running. start Starts the Snaplex process on the node. stop Stops the Snaplex process on the node. restart Stops and restarts the Snaplex process on the node. Restarts both the monitor and the Snaplex processes. diagnostic Generates the diagnostic report for the Snaplex node. The HTML output file is generated in the $SNAPLOGIC/run/log directory. Resolve any warnings from the report to ensure normal operations. clearcache Clears the cache files from the node. This command must be executed when the JCC is stopped. addDataKey Generates a new key pair and appends it to the keystore in the /etc/snaplogic folder with the specified alias. This command is used to rotate the private keys for Enhanced Account Encryption. Doc reference: Enhanced Account Encryption The following options are available for a Groundplex on Windows server. install_service remove_service The jcc.bat install_service command installs the Snaplex as a Windows service. The jcc.bat remove_service command removes the installed Windows service. Run these commands as an Administrator user. Table 4.0 jcc script arguments Example of custom log configuration for a Snaplex node (Groundplex) Custom log file configuration is occasionally required due to internal logging specifications or to troubleshoot problems with specific Snaps. In the following example, we illustrate the steps to configure the log level of ‘Debug’ for the Azure SQL Snap pack. The log level can be customized for each node of the Groundplex where the related pipelines are executed, and will be effective for all pipelines that use any of the Azure SQL Snaps (for example, Azure SQL - Execute, Azure SQL - Update, etc.). Note that Debug logging can affect pipeline performance so this configuration must only be used for debugging purposes. Configuration Steps Follow steps 1 and 2 from this document: Custom log configuration Note: You can perform Step 2 by adding the property key and value under the Global Properties section. Example: Key: jcc.jvm_options Value: -Dlog4j.configurationFile=/opt/snaplogic/logconfig/log4j2-jcc.xml The Snaplex node must be restarted for the change to take effect. Refer to the commands in Table 3.0. b. Edit the log4j2-jcc.xml file configured in Step a. c. Add a new RollingRandomAccessFile element under <Appenders>. In this example, the element is referenced with a unique name JCC_AZURE. It also has a log size and rollover policy defined. The policy would enable generation of up to 10 log files of 1 MB each. These values can be adjusted depending on your requirements. <RollingRandomAccessFile name="JCC_AZURE" fileName="${env:SL_ROOT}/run/log/${sys:log.file_prefix}jcc_azure.json" immediateFlush="true" append="true" filePattern="${env:SL_ROOT}/run/log/jcc_azure-log-%d{yyyy-MM-dd-HH-mm}.json” ignoreExceptions="false"> <JsonLogLayout properties="true"/> <Policies> <SizeBasedTriggeringPolicy size="1 MB"/> </Policies> <DefaultRolloverStrategy max="10"/> </RollingRandomAccessFile> … … </Appenders> d. The next step is to configure a Logger that references the Appender defined in step #c. This is done by adding a new <Logger> element. In this example, the Logger is defined with log level = Debug. <Logger name="com.snaplogic.snaps.azuresql" level="debug" includeLocation="true" additivity="false"> <AppenderRef ref="JCC_AZURE" /> </Logger> .. .. <Root> … </Root </Loggers> </Configuration> The value for the name attribute is derived from the Class FQID value of the associated Snap. The changes to log4j2-jcc.xml are marked by the highlighted text in steps c and d. The complete XML file is also attached for reference. You can refer to the Log4j documentation for more details on the attributes or for additional customization. Log4j reference Debug log messages and log files Additional debug log messages will be printed to the pipeline execution logs for any pipeline with Azure SQL Snaps. These logs can be retrieved from Dashboard. Example: {"ts": "2023-11-30T20:21:33.490Z", "lvl": "DEBUG", "fi": "JdbcDataSourceRegistryImpl.java:369", "msg": "JDBC URL: jdbc:sqlserver://sltapdb.database.windows.net:1433;database=SL.TAP;encrypt=true;trustServerCertificate=false;hostNameInCertificate=*.database.windows.net;authentication=sqlPassword;loginTimeout=30;connectRetryCount=3;connectRetryInterval=5;applicationName=SnapLogic (main23721) - pid-113e3955-1969-4541-9c9c-e3e0c897cccd, database server: Microsoft SQL Server(12.00.2531), driver: Microsoft JDBC Driver 11.2 for SQL Server(11.2.0.0)", "snlb": "Azure+SQL+-+Update", "snrd": "5c06e157-81c7-497f-babb-edc7274fa4f6", "plrd": "5410a1bdc8c71346894494a2_f319696c-6053-46af-9251-b50a8a874ff9", "prc": "Azure SQL - The updated log configuration would also write the custom JCC logs (for all pipelines that have executed the Azure SQL Snaps) to disk under the /opt/snaplogic/run/log directory. The file size for each log file and the number of files would depend on the configuration in the log4j2-jcc.xml file. The changes to log4j2-jcc.xml can be reverted if the additional custom logging is no longer required. Log level configuration for a Snaplex in Production Orgs The default log level for a new Snaplex is ‘Debug.’ This value can be updated to ‘Info’ in Production Orgs as a best practice. The available values are: Trace: Records details of all events associated with the Snaplex. Debug: Records all events associated with the Snaplex. Info: Records messages that outline the status of the Snaplex and the completed Tasks. Warning: Records all warning messages associated with the Snaplex. Error: Records all error messages associated with the Snaplex. Reference: Snaplex logging PlexFS File Storage considerations PlexFS also known as suggest space is a storage location on the local disk of the JCC node. The /opt/snaplogic/run/fs folder is commonly designated for this purpose. It is used as a data store to temporarily store preview data during pipeline validation, as well as to maintain the state data for Resumable pipelines. Disk volumes To address issues that cause disk full errors and to ensure smoother operations of the systems that affect the stability of the Groundplex, you need to have separate mounts on Groundplex nodes. Follow the steps suggested below to create two separate disk volumes on the JCC nodes. Reference: Disk Volumes The /opt/snaplogic/run/fs folder location is used for the PlexFS operations. mount --bind /workspace/fs /opt/snaplogic/run/fs Folder Structure: The folders under PlexFS are created with this path structure: /opt/snaplogic/run/fs/<Environment>/<ProjectSpace>/<Project>/__suggest__/<Asset_ID> Example: /opt/snaplogic/run/fs/Org1/Proj_Space_1/Project1/__suggest__/aaa5010bc The files in the sub-folders are created with these extensions: *.jsonl *.dat PlexFS File Creation The files in /opt/snaplogic/run/fs are generated when a user performs pipeline validation. The amount of data in a .dat file is based on the “Preview Document Count” user setting. For Snaps with binary output (such as File Reader), the Snap will stop writing to PlexFS when the next downstream Snap has generated its limit of Preview data. PlexFS File Deletion The files for a specific pipeline are deleted when the user clicks ‘Retry’ to perform validation. New data files are generated. Files for a specific user session are deleted when the user logs out of SnapLogic. All PlexFS files are deleted when the Snaplex is restarted. Files in PlexFS are generated with an expiration date. The default expiration date is two days. The files are cleaned up periodically based on the expiration date. It is possible to set a feature flag to override the expiration time, and delete the files sooner. Recommendations The temp files are cleaned up periodically based on the default expiration date however you might occasionally encounter disk space availability issues due to excessive Preview data being written to the PlexFS file storage. The mount directory location can be configured with additional disk space or shared file storage (e.g. Amazon EFS). Contact SnapLogic support for details on the feature flag configuration to update the expiration time to a shorter duration for faster file clean up. The value for this feature flag is set in seconds.
ramaonline
6 months ago Place Sigma Framework Library
5.9KViews
4likes
0Comments