- Overview

A module in Filebeat is a way to parse a specific log file format for a particular piece of software.
- Pipeline

A pipeline is a definition of a series of processors that are to be executed in the same order as they are declared. A pipeline consists of two main fields:
  - a description: the description is a special field to store a helpful description of what the pipeline does
  - a list of processors: the processors parameter defines the list of processors to be executed in order
```json
{
  "description": "...",
  "processors": [ ... ]
}
```
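As a concrete sketch, a small pipeline that parses Apache access logs could look like this (the pipeline name, grok pattern, and field names are illustrative, not from the original text):

```console
PUT _ingest/pipeline/apache-access
{
  "description": "Parse Apache access logs into structured fields",
  "processors": [
    {
      "grok": {
        "field": "message",
        "patterns": ["%{COMMONAPACHELOG}"]
      }
    },
    {
      "remove": {
        "field": "message"
      }
    }
  ]
}
```

The processors run in order: first the grok processor extracts structured fields from the raw line, then the remove processor drops the now-redundant original message.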
- Painless scripting language
Painless is a simple, secure scripting language designed specifically for use with Elasticsearch.
It is the default scripting language for Elasticsearch and can safely be used for inline and stored scripts.
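For example, a minimal inline Painless script in an update request (the index name, field, and parameters here are hypothetical):

```console
POST my-index/_update/1
{
  "script": {
    "lang": "painless",
    "source": "ctx._source.counter += params.count",
    "params": {
      "count": 4
    }
  }
}
```

Passing values through `params` rather than hard-coding them in `source` lets Elasticsearch reuse the compiled script.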
- Mapping

Mapping is the process of defining how a document, and the fields it contains, are stored and indexed. A mapping definition has:
  - Metadata fields, which are used to customize how a document's associated metadata is treated. Examples of metadata fields include the document's _index, _id, and _source fields.
  - A list of fields or properties pertinent to the document. Each field has its own data type.

Defining too many fields in an index can lead to a mapping explosion, which can cause out-of-memory errors and situations that are difficult to recover from.

There are two ways to implement mapping:
  - Dynamic mapping: one of the most important features of Elasticsearch is that it tries to get out of your way and let you start exploring your data as quickly as possible. To index a document, you don't have to first create an index, define a mapping type, and define your fields; you can just index a document, and the index, type, and fields will spring to life automatically. This automatic detection and addition of new fields is called dynamic mapping.
  - Explicit mapping: you define the fields and their data types yourself, typically when creating the index.
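The two approaches can be sketched as follows (index and field names are illustrative). With dynamic mapping you simply index a document and let Elasticsearch infer the field types; with explicit mapping you declare them up front:

```console
# Dynamic mapping: Elasticsearch infers "message" as text,
# "attempts" as long, and "timestamp" as date.
PUT my-index-dynamic/_doc/1
{
  "message": "login failed",
  "attempts": 3,
  "timestamp": "2021-01-19T10:00:00Z"
}

# Explicit mapping: field types are declared when the index is created.
PUT my-index-explicit
{
  "mappings": {
    "properties": {
      "message":   { "type": "text" },
      "attempts":  { "type": "long" },
      "timestamp": { "type": "date" }
    }
  }
}
```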
- Data replication model

Each index in Elasticsearch is divided into shards, and each shard can have multiple copies. These copies are known as a replication group and must be kept in sync when documents are added or removed. The process of keeping the shard copies in sync and serving reads from them is what we call the data replication model.

This model is based on having a single copy from the replication group that acts as the primary shard; the other copies are called replica shards.
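The number of shards and replicas is configured per index; a minimal sketch (index name and values are arbitrary):

```console
PUT my-index
{
  "settings": {
    "number_of_shards": 3,
    "number_of_replicas": 1
  }
}
```

With these settings, each of the three primary shards has one replica, so each replication group contains two copies.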
- Ingest node
The built-in modules almost entirely use the Ingest node feature of Elasticsearch instead of the Beats processors. One of the most helpful parts of the ingest pipeline is the ability to debug with the Simulate Pipeline API.

The simulate pipeline API executes a specific pipeline against a set of documents provided in the body of the request. You can either specify an existing pipeline to execute against the provided documents or supply a pipeline definition in the body of the request.

You can use the simulate pipeline API to see how each processor affects the ingest document as it passes through the pipeline. To see the intermediate results of each processor in the simulate request, add the verbose parameter to the request.

The ingest pipeline works at the document level; you still need to check for multi-line events such as exceptions where the logs are generated and let Filebeat create a single message out of them.
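A minimal simulate request, using a hypothetical inline pipeline with a single lowercase processor on a user field:

```console
POST _ingest/pipeline/_simulate?verbose=true
{
  "pipeline": {
    "description": "Lowercase the user field",
    "processors": [
      { "lowercase": { "field": "user" } }
    ]
  },
  "docs": [
    { "_source": { "user": "ALICE" } },
    { "_source": { "user": "Bob" } }
  ]
}
```

The response shows each document after the pipeline runs; with verbose=true it includes one entry per processor, so you can inspect the intermediate state at every step.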
- Suricata fields
Reposted from blog.csdn.net/The_Time_Runner/article/details/113001917