Offline Report Cluster Architecture - Management

Updated November 16, 2023 08:09

Introduction

This article describes how Logsign's offline report feature works and how it is managed in a clustered environment.

Spark Technology

Apache Spark is an open-source distributed data processing framework designed for fast analysis of large data sets.

Logsign Offline Report

Logsign uses Hadoop to store the large archive of logs needed for offline reports. When an offline report is started, the archived data in Hadoop is passed to the offline-worker service.

Offline reports rely on two services: Offline Master and Offline Worker.

Logsign Offline Master

The Offline Master service manages the Spark master service in the background. The Spark master in turn manages the Spark worker services, which use the processing power of the worker servers to re-index the fragmented and distributed archive logs.

The Offline Master service should run on only one server in the cluster.
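The master/worker split described above can be pictured with a small sketch. This is only an illustration of the idea of spreading archive fragments across workers, not Spark's actual scheduler or Logsign's implementation; the function name `assign_fragments`, the fragment names, and the round-robin strategy are all assumptions made for the example.

```python
def assign_fragments(fragments, workers):
    """Spread archive fragments across workers in round-robin order.

    Hypothetical helper: a real Spark master schedules tasks dynamically,
    this only shows the basic idea of dividing the work.
    """
    if not workers:
        raise ValueError("at least one worker is required")
    plan = {worker: [] for worker in workers}
    for index, fragment in enumerate(fragments):
        # Pick the next worker in turn for each fragment.
        worker = workers[index % len(workers)]
        plan[worker].append(fragment)
    return plan


if __name__ == "__main__":
    # Illustrative fragment and worker names only.
    plan = assign_fragments(["frag-1", "frag-2", "frag-3"], ["worker-a", "worker-b"])
    for worker, assigned in plan.items():
        print(worker, assigned)
```

With more workers in the list, each worker receives fewer fragments, which is the intuition behind adding Spark worker services to speed up a report.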

Logsign Offline Worker

The Offline Worker service manages the Spark worker service in the background. The Spark worker is the service that actually processes the logs for offline reports.

Multiple Spark worker services can run at the same time, and their performance is directly proportional to the server's hardware capabilities. In other words, the more Spark worker services you run, the faster the offline report process completes, as long as the server has the resources to support them.

During processing, Spark worker services can use a lot of CPU and RAM, depending on the compression ratio of the data.
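As a rough way to reason about this scaling, the sketch below estimates report duration under ideal, perfectly proportional speed-up. It is a back-of-the-envelope model only: the function name and parameters are hypothetical, it assumes each worker needs its own CPU, and it ignores RAM limits, disk speed, and coordination overhead, so real results will be less favorable.

```python
def estimated_report_minutes(single_worker_minutes, workers, available_cpus):
    """Estimate report duration assuming ideal linear scaling.

    Hypothetical model: workers beyond the number of available CPUs
    are assumed to add no extra speed.
    """
    if workers < 1 or available_cpus < 1:
        raise ValueError("workers and available_cpus must be at least 1")
    effective_workers = min(workers, available_cpus)
    return single_worker_minutes / effective_workers


if __name__ == "__main__":
    # Illustrative numbers: a report that takes 120 minutes with one worker.
    for count in (1, 2, 4, 8):
        print(count, "workers:", estimated_report_minutes(120, count, available_cpus=4))
```

In this model, going beyond the number of available CPUs gives no further gain, which matches the point that performance depends on the server's hardware.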

To increase the number of Spark worker services, use the cluster panel: click the Edit button on a server that runs the Offline Worker service and increase the number of Spark worker services there.

The number of Spark worker services is allocated according to a CPU multiplier. Each Spark worker service uses one CPU.
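The relationship between CPUs and Spark workers can be sketched as a small calculation. The article does not define the CPU multiplier precisely, so this example assumes it is a fraction between 0 and 1 of the server's CPUs that may be given to Spark workers, with one CPU per worker. The function name and this interpretation are assumptions, not Logsign's documented formula.

```python
import math


def spark_worker_count(cpu_count, cpu_multiplier):
    """Return how many Spark workers fit, at one CPU per worker.

    Assumption: cpu_multiplier is the fraction (0 < m <= 1) of the
    server's CPUs that Spark workers are allowed to use.
    """
    if cpu_count < 1:
        raise ValueError("cpu_count must be at least 1")
    if not 0 < cpu_multiplier <= 1:
        raise ValueError("cpu_multiplier must be in the range (0, 1]")
    # Always allow at least one worker so reports can still run.
    return max(1, math.floor(cpu_count * cpu_multiplier))


if __name__ == "__main__":
    # Illustrative values: an 8-CPU server with half its CPUs reserved for Spark.
    print(spark_worker_count(8, 0.5))
```

Leaving some CPUs unallocated in this way keeps headroom for the other services on the same server, since Spark workers can be CPU and RAM intensive during processing.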

When you have finished making changes, click the Save Plan button and then run the apply_plan script in a terminal.
