It is highly recommended that log management systems run on hardware with high system specifications, so that all data in the architecture is collected without loss and data analysis is as efficient as possible.
Most companies that run their own hardware size it against the minimum system requirements. In the Logsign architecture, checking these requirements before installation is essential: Logsign places no limit on data collection, so the specifications must be calculated in advance.
Logsign has no EPS (events per second) limit on the data it accepts from any source, so it guarantees that all logs are collected without losing any of them. This makes the calculation of the system specifications even more important.
Logsign support engineers ask you for information about your log sources in order to determine the EPS. Your hardware requirements are then determined from the EPS value and the source types.
For example, the disk requirement of a 1000 EPS system is calculated as described in the sections below.
How to Calculate EPS?
EPS values can never be calculated with 100% accuracy; they can only be determined as averages. Even when the system is fully integrated and all logs are being collected, it is impossible to state the EPS value at any future moment “T” as a definite number before the logs arrive, because the amount of data generated in the system changes constantly.
For example, if the traffic logs of users accessing the internet produce a value of "x" during a certain period, that value may fall to "y" at noon when traffic decreases.
Logsign lets users determine the EPS values in their own systems. Determining these values at the start of the installation is important for the installation to proceed smoothly.
Please note that: For help with EPS calculation, please contact Logsign Customer Services.
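Because EPS fluctuates over the day, it is usually estimated from event counts sampled over fixed intervals. The following is a minimal sketch of that idea, not a Logsign tool: the function name `average_eps` and the hourly counts are hypothetical values chosen for illustration.

```python
from statistics import mean


def average_eps(event_counts, interval_seconds):
    """Return (average EPS, peak EPS) for counts sampled over equal intervals."""
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    if not event_counts:
        raise ValueError("event_counts must not be empty")
    # Convert each per-interval count into an events-per-second rate.
    rates = [count / interval_seconds for count in event_counts]
    return mean(rates), max(rates)


# Hypothetical event counts for four one-hour intervals.
hourly_counts = [3_600_000, 5_400_000, 1_800_000, 3_600_000]
avg, peak = average_eps(hourly_counts, 3600)
print(f"average EPS: {avg:.0f}, peak EPS: {peak:.0f}")
```

Reporting the peak alongside the average shows how far a busy period can sit above the figure used for sizing.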
Effects of EPS Values on Disk Size
EPS values give an approximate basis for the disk size calculation. However, the calculation logic may vary according to the sources from which the data is received. The reference values used to calculate averages are as follows:
One (1) line of "Firewall" data is about 0.3 KB on average (this may vary by firewall manufacturer). After normalization, the log management solution stores each row in its own system, and the size reaches 0.7 to 1 KB on average (traffic logs produce small records, while login logs or logs with many characters are larger).
One (1) line of "Microsoft" data may vary between 0.8 and 1 KB, because the data contains many parameters.
These sizes are achievable on systems that use "NoSQL", like the Logsign architecture. For manufacturers that depend on traditional SQL architectures, however, these values can be threefold or higher.
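As a rough illustration of how an EPS value and an average stored event size translate into daily volume, the sketch below applies the 0.7 to 1 KB normalized size range to a 1000 EPS system. The function name `daily_volume_gb` is hypothetical, and decimal units (1 GB = 1,000,000 KB) are assumed; real volumes depend on the source mix.

```python
SECONDS_PER_DAY = 86_400
KB_PER_GB = 1_000_000  # assumption: decimal units


def daily_volume_gb(eps, avg_event_kb):
    """Estimate the stored data volume per day in GB."""
    if eps < 0 or avg_event_kb < 0:
        raise ValueError("eps and avg_event_kb must be non-negative")
    return eps * SECONDS_PER_DAY * avg_event_kb / KB_PER_GB


# 1000 EPS at the low and high end of the normalized size range.
low = daily_volume_gb(1000, 0.7)
high = daily_volume_gb(1000, 1.0)
print(f"about {low:.1f} to {high:.1f} GB per day")
```

Under these assumptions, a 1000 EPS system stores roughly 60 to 86 GB per day before any archive compression.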
Calculate Disk Size
Based on the above information, we will use a simple formula to calculate the average disk size that the system will use.
To calculate the minimum disk space that the system will use in the Logsign architecture, the amount of data to be kept on the "index" and the amount of data to be stored in the archive must be determined. It is recommended that logs are kept on the index for at least fifteen days, because the most recent fifteen days of data are especially critical for security.
Let’s explain the relationship between index and archive data. The index is the section that holds live data; in the calculation below it is counted at its full daily size. The archive is a separate partition where the same data is stored compressed. Although the compression ratio varies according to the type of compression, these calculations are commonly based on a ratio of 1/10 to 1/20.
Please note that: Some data may be compressed to 1/40, while other data may not compress at all.
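Since the compression ratio can range from none at all to 1/40, it can help to see how strongly it affects the archive size. The sketch below is illustrative only: `archived_gb` is a hypothetical helper, and the 100 GB daily volume is an assumed input.

```python
def archived_gb(raw_gb, ratio):
    """Size after compression, where ratio=20 means the data shrinks to 1/20."""
    if ratio < 1:
        raise ValueError("ratio must be at least 1 (1 means no compression)")
    return raw_gb / ratio


raw_daily_gb = 100.0  # assumed daily data volume
for ratio in (1, 10, 20, 40):
    print(f"1/{ratio}: {archived_gb(raw_daily_gb, ratio):.1f} GB per day in the archive")
```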
Calculation Formula
Total Data = (Daily Index Size * Index Day Count) + (Daily Archive Size * Archive Day Count)
For example, take a site that produces one hundred (100) GB of data a day and keeps it on the index for 15 days. Suppose the logs will be kept in the archive for one (1) year, and the average compression ratio for this system is assumed to be 1/20, so each day's 100 GB takes about 5 GB in the archive.
Applying the formula gives:
Live Data – Index Size = 100 GB * 15 = 1.5 TB
Archive Data = 5 GB * 365 = 1,825 GB ≈ 1.8 TB
Total Data Size = Live Data + Archive = 1.5 TB + 1.8 TB = 3.3 TB
Please note that: It is recommended to reserve an extra 500 GB of space to absorb possible increases in peak values in the system.
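The worked example above can be reproduced with a short script. This is a minimal sketch of the formula, not an official Logsign sizing tool; the function and parameter names are hypothetical, and sizes are in decimal GB.

```python
def total_disk_gb(daily_gb, index_days, archive_days, compression_ratio,
                  headroom_gb=0.0):
    """Return (index GB, archive GB, total GB) for the given retention plan.

    compression_ratio=20 means archived data shrinks to 1/20 of its size.
    """
    if compression_ratio < 1:
        raise ValueError("compression_ratio must be at least 1")
    index_gb = daily_gb * index_days
    archive_gb = daily_gb / compression_ratio * archive_days
    return index_gb, archive_gb, index_gb + archive_gb + headroom_gb


# 100 GB per day, 15 days on the index, 1 year in the archive, 1/20 compression.
index_gb, archive_gb, total_gb = total_disk_gb(100, 15, 365, 20)
print(index_gb, archive_gb, total_gb)  # 1500 1825.0 3325.0

# Same plan with the recommended 500 GB of extra space.
print(total_disk_gb(100, 15, 365, 20, headroom_gb=500)[2])  # 3825.0
```

With the headroom included, the example system needs about 3.8 TB rather than 3.3 TB.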
Factors affecting the disk space (size) calculation:
Normalization of log: If geographical information is available in the log, the log size will grow. A high number of rows generated by the manufacturer's device also affects the size of the data.
Filtered Values: If data in the log is marked as not useful, those values are removed from the logs. Because part of each log is dropped, it can be stored in a smaller size.
Redundancy: This is an optional feature of the Logsign architecture. If the same log appears more than once in the same time period, some of the copies may not be stored. For example, if you receive fifteen (15) logs with the same data within one (1) second, you can save only 5 of them.
Please note that: For more detailed information about the calculation methods, please contact Logsign Customer Services.
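To make the redundancy example concrete, the sketch below keeps at most five identical logs per one-second window. It only illustrates the idea and does not describe how Logsign implements the feature; the function name, the `(timestamp, message)` event shape, and the fixed-window approach are all assumptions.

```python
from collections import defaultdict


def limit_duplicates(events, max_per_window=5, window_seconds=1):
    """Keep at most max_per_window identical messages per time window.

    events is an iterable of (timestamp_in_seconds, message) tuples.
    """
    seen = defaultdict(int)  # (window index, message) -> copies kept so far
    kept = []
    for timestamp, message in events:
        key = (int(timestamp // window_seconds), message)
        if seen[key] < max_per_window:
            seen[key] += 1
            kept.append((timestamp, message))
    return kept


# Fifteen identical logs arriving within the same second (hypothetical data).
burst = [(i / 15, "deny tcp 10.0.0.1 -> 10.0.0.2") for i in range(15)]
print(len(limit_duplicates(burst)))  # 5
```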