Introduction
This article explains how to check the CPU, RAM, disk, and network of a Logsign server for resource shortages and I/O problems.
CPU Control
Services such as Elasticsearch, Logsign-parser, and Logsign-alarmflow are the heaviest CPU consumers on a Logsign server. In some cases, the server's CPU may not be sufficient.
You can check CPU usage and load average with htop or top. If usage stays high continuously, increase the CPU resources allocated to the server.
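As a rough way to script the load average check, the following sketch compares the 1-minute load average with the number of CPU cores. Treating a load above the core count as "high" is a common rule of thumb assumed here, not a Logsign threshold, and the function names `load_is_high` and `check_load` are hypothetical.

```shell
#!/bin/sh
# load_is_high LOAD CORES
# Succeeds (exit status 0) when LOAD is numerically greater than CORES.
load_is_high() {
    awk -v load="$1" -v cores="$2" 'BEGIN { exit !(load > cores) }'
}

# check_load
# Reads the 1-minute load average from /proc/loadavg and compares it
# with the core count reported by nproc.
check_load() {
    load=$(cut -d ' ' -f 1 /proc/loadavg)
    cores=$(nproc)
    if load_is_high "$load" "$cores"; then
        echo "HIGH: load $load exceeds $cores cores"
    else
        echo "OK: load $load on $cores cores"
    fi
}
```

Calling `check_load` several times over a few minutes helps distinguish sustained overload from a short spike.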
If the CPU is insufficient, you may experience slowness in your user interface and all other operations.
Besides the Logsign services listed above, a process outside Logsign may also be consuming CPU constantly. Such a process may be a package or agent installed on the Logsign server; you can reduce CPU usage by removing unneeded packages and agents from the server.
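To spot such processes, you can filter the process list for heavy CPU consumers that do not look like Logsign services. The sketch below is only an illustration: the pattern in `LOGSIGN_PATTERN` is an assumption about how Logsign-related command lines appear on your server, and `filter_hogs` is a hypothetical helper name.

```shell
#!/bin/sh
# Assumed pattern for Logsign-related command lines; adjust it to your installation.
LOGSIGN_PATTERN='logsign|elasticsearch'

# filter_hogs MIN_PCPU
# Reads "PCPU COMMAND..." lines on stdin and prints the lines at or above
# MIN_PCPU whose command line does not match LOGSIGN_PATTERN.
filter_hogs() {
    awk -v min="$1" -v pat="$LOGSIGN_PATTERN" \
        '($1 + 0) >= (min + 0) && tolower($0) !~ pat { print }'
}

# Typical use on a live server:
#   ps -eo pcpu=,args= --sort=-pcpu | filter_hogs 20
```

Note that ps reports CPU usage averaged over each process's lifetime, so confirm any suspicious entry in top or htop before removing a package.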
RAM Control
Services such as Logsign Elasticsearch, Logsign-parser, and Logsign-alarmflow are the heaviest RAM consumers. In some cases, RAM usage reaches the server's capacity and the system starts using the swap area. Sustained usage of 80-100% of the 4 GB swap area indicates a problem. If the increase is not caused by an abnormal process outside the Logsign services, upgrade the server's RAM.
RAM availability is critical. If RAM is insufficient, operating system services may be shut down, causing problems in the services that depend on them.
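The swap usage percentage can be read from /proc/meminfo. The sketch below is one possible way to check it against the 80% level mentioned above; the function names `swap_used_pct` and `check_swap` are hypothetical.

```shell
#!/bin/sh
# swap_used_pct TOTAL_KB FREE_KB
# Prints the used share of swap as an integer percentage (0 when no swap exists).
swap_used_pct() {
    awk -v t="$1" -v f="$2" \
        'BEGIN { if (t == 0) print 0; else printf "%d\n", (t - f) * 100 / t }'
}

# check_swap
# Reads SwapTotal and SwapFree (both in kB) from /proc/meminfo and warns at 80% or more.
check_swap() {
    total=$(awk '/^SwapTotal:/ { print $2 }' /proc/meminfo)
    free=$(awk '/^SwapFree:/ { print $2 }' /proc/meminfo)
    pct=$(swap_used_pct "$total" "$free")
    if [ "$pct" -ge 80 ]; then
        echo "WARNING: swap usage at ${pct}%"
    else
        echo "OK: swap usage at ${pct}%"
    fi
}
```

Running `check_swap` alongside `free -m` shows whether the swap pressure comes with exhausted RAM or is left over from an earlier spike.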
Disk Control
Disk read/write speed is important on Logsign servers; slow disks can affect the entire system and even cause log interruptions. You can monitor disk I/O with the following command.
iotop
You can view the current I/O rates of your disk in its output.
You can check the disk for problems by testing its write speed with the following command.
dd if=/dev/zero of=/tmp/logsign.test bs=1M count=1024
The example above measures how fast a 1 GB file is written to the disk. If the average speed drops below 100 MB/s, check your disk.
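If you want to compare the result with the 100 MB/s level in a script, the rate can be recomputed from dd's summary line. The sketch below assumes GNU dd output in an English locale; `dd_rate_mb_s` and `rate_ok` are hypothetical helper names. The usage comment adds `conv=fdatasync`, a GNU dd option that is not part of the command above; it flushes the data to disk before the rate is reported, so the page cache does not inflate the figure.

```shell
#!/bin/sh
# dd_rate_mb_s
# Reads dd's output on stdin and prints the throughput in MB/s
# (1 MB = 1,000,000 bytes), computed from the byte count and elapsed seconds.
dd_rate_mb_s() {
    awk '/ copied, / {
        n = split($0, part, ", ")
        bytes = $1 + 0
        secs = part[n - 1] + 0
        if (secs > 0) printf "%.1f\n", bytes / secs / 1000000
    }'
}

# rate_ok RATE MIN
# Succeeds when RATE is at least MIN.
rate_ok() {
    awk -v r="$1" -v m="$2" 'BEGIN { exit !(r >= m) }'
}

# Typical use (dd prints its statistics on stderr, hence 2>&1):
#   rate=$(LC_ALL=C dd if=/dev/zero of=/tmp/logsign.test bs=1M count=1024 conv=fdatasync 2>&1 | dd_rate_mb_s)
#   rate_ok "$rate" 100 || echo "Slow disk: $rate MB/s"
#   rm -f /tmp/logsign.test
```

Removing the test file afterwards keeps the 1 GB of test data from occupying /tmp.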
Disk latency can be checked with the following command. It is recommended that the average rate in the test results not fall below 1 MB/s.
dd if=/dev/zero of=/tmp/test2.img bs=512 count=1000 oflag=dsync
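Because `oflag=dsync` makes every 512-byte write wait for the disk, the elapsed time divided by the number of writes gives an approximate per-write latency. The following sketch derives that figure from dd's output; it assumes GNU dd output in an English locale, and `dd_write_latency_ms` is a hypothetical helper name.

```shell
#!/bin/sh
# dd_write_latency_ms
# Reads dd's full output on stdin and prints the average time per
# synchronous write in milliseconds (elapsed seconds / records written).
dd_write_latency_ms() {
    awk '/ records out/ { split($1, rec, "+"); writes = rec[1] + 0 }
         / copied, /    { n = split($0, part, ", "); secs = part[n - 1] + 0 }
         END { if (writes > 0) printf "%.2f\n", secs * 1000 / writes }'
}

# Typical use:
#   LC_ALL=C dd if=/dev/zero of=/tmp/test2.img bs=512 count=1000 oflag=dsync 2>&1 | dd_write_latency_ms
#   rm -f /tmp/test2.img
```

A per-write figure is often easier to compare between disks than the MB/s value, which depends on the block size chosen for the test.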
Network Control
You can perform a quick test by pinging the gateway address from every network interface on your Logsign servers.
The ping response time should not exceed 2.0 ms.
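The average response time can be pulled out of ping's summary line and compared with the 2.0 ms limit. The sketch below assumes the usual Linux summary format (`rtt min/avg/max/mdev = ...`); `ping_avg_ms` and `rtt_exceeds` are hypothetical helper names, and the interface name and gateway address in the usage comment are placeholders.

```shell
#!/bin/sh
# ping_avg_ms
# Reads ping output on stdin and prints the average round-trip time in ms.
ping_avg_ms() {
    awk -F '=' '/min\/avg\/max/ { split($2, v, "/"); print v[2] + 0 }'
}

# rtt_exceeds AVG LIMIT
# Succeeds when AVG is greater than LIMIT.
rtt_exceeds() {
    awk -v a="$1" -v l="$2" 'BEGIN { exit !(a > l) }'
}

# Typical use (eth0 and 192.0.2.1 are placeholders for your interface and gateway):
#   avg=$(ping -c 5 -I eth0 192.0.2.1 | ping_avg_ms)
#   rtt_exceeds "$avg" 2.0 && echo "Gateway latency too high: $avg ms"
```

Repeating the check for each interface covers all network cards on the server.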