1. Improvements in Architecture

Filebeat VS Logstash:

Filebeat and Logstash provide inconsequential transports for centralizing and forwarding log information. It supports maintaining simple objects by providing a very easy way to manage and centralize files, folders and logs. In other words, Logstash is described as collecting, parsing, and augmenting data. The important difference between Logstash and Filebeat is their functionality, Filebeat consumes fewer resources. But in general, Logstash consumes various inputs, and dedicated beats do the job of collecting data with minimal RAM and CPU.

In order to improve the efficiency of the ELK stack, it is best to use the combination of these two tools together. Logstash acts as an aggregator – pulling data from various sources and pushing it into a pipeline, usually into Elasticsearch, but also into a buffer component in large production environments. It’s worth mentioning that the latest version of Logstash also includes support for persistent queues when storing message queues on disk. On the other hand, Filebeat and other members of the Beats family act as lightweight agents deployed on edge hosts, pumping data into Logstash for aggregation, filtering, and enrichment.

2. Improvements on Elasticsearch

What is Elasticsearch?

Elasticsearch is a distributed document store based on the Apache Lucene library specializing in full-text search for schema-free documents and provides access to raw event-level data. 

  • Cluster:

ElasticSearch can be used as a standalone single search server. However, in order to handle large data sets and achieve fault tolerance and high availability, ES can run on many servers that cooperate with each other. The collection of these servers is called a cluster.

  • Node:

Each server that forms a cluster is called a node

  • Shard:

When there are a large number of documents, one node may not be enough due to memory limitations, insufficient disk processing power, inability to respond to client requests fast enough, etc. In this case, the data can be divided into smaller slices. Each slice is put on a different server.
When the index of your query is distributed over multiple shards, ES sends the query to each relevant shard and combines the results without the application being aware of the existence of the shards. I.e., the process is transparent to the user.

  • Replia:

To increase query throughput or achieve high availability, you can use a sharded replica.
A replica is an exact copy of a slice, and each slice can have zero or more copies. there can be many identical slices in ElasticSearch, one of which is selected to change the index operation, and this particular slice is called the master slice.

Pre-filter implementation principle:

The data node determines whether there is an intersection between a Range query and sharding, depending on an important feature of Lucene: PointValues.

Pre-filter will not be executed in all query processes. It will be executed only if the following conditions are met at the same time:

  1. The number of shards to be queried is greater than 128 (specified by pre_filter_shard_size)
  2. Aggregation requests do not require access to all docs. 

In addition, although the numerical query of the non-Date type will also go through the pre-filter process, it will not judge the range internally.

In order to improve on the current ElasticSearch, the pre-filtering phase can be adapted to send only one request per node in this phase, covering all slices on the node. If a cluster has thousands of slices on three data nodes, the number of network requests will change from thousands to three or less during the initial search phase, regardless of the number of slices searched.

Improvement in Network

Each Elasticsearch node has two different network interfaces. Clients use its HTTP interface to send requests to Elasticsearch’s REST APIs, but nodes use the transport interface to communicate with other nodes. The transport interface is also used for communication with remote clusters. We come to acknowledge an abstract application programming interface (API) for the transport layer that enables dynamic selection of transport protocols and network paths at runtime. This API enables faster deployment of new protocols and protocol features without requiring modifications to the application. The specified API follows the transport service architecture and provides asynchronous, atomic messaging. It is intended to replace the BSD socket API as a common interface to the transport layer, in an environment where endpoints can choose from multiple interfaces and potential transport protocols. We want to apply this abstraction layer into the process of ElasticSearch in order to improve its efficiency.

[1] https://logz.io/blog/filebeat-vs-logstash/

[2] https://www.educba.com/filebeat-vs-logstash/