9.2. Grok
Syslog messages are free-form text lines that were not originally designed to be processed by machines. Vendors use different syslog message formats and can change them at any time. For this reason, NSG Agent uses a technology called Grok to extract meaningful information from unstructured text. With Grok you build a set of patterns and apply them to incoming messages; when a pattern matches a message, it is used to extract key-value pairs from that message.
One can think of Grok patterns in terms of regular expression capturing groups. Unlike a capturing group, a Grok pattern defines both the matching expression and the key name of the key-value pair that is produced when it finds matching text in the input string. Grok patterns can have names and can refer to other Grok patterns, defined elsewhere, by name.
For example, a syslog message pattern could be defined as:
SYSLOGLINE %{SYSLOGBASE2} %{GREEDYDATA:message}
This pattern is named SYSLOGLINE and refers to two other patterns, SYSLOGBASE2 and GREEDYDATA. When SYSLOGLINE is applied to a text string, it matches only when both SYSLOGBASE2 and GREEDYDATA match.
Instead of typing out the complex expressions defined inside SYSLOGBASE2 and GREEDYDATA, we reuse them to build the new pattern SYSLOGLINE, which works as if we had concatenated SYSLOGBASE2 and GREEDYDATA, separated by a single space.
The ability to refer to patterns by name provides a powerful reuse mechanism, as the example below shows.
NetSpyGlass comes with a built-in library of Grok patterns that you can use to build new ones to parse your syslog messages.
SYSLOGBASE2 looks like this:
SYSLOGBASE2 (?:%{SYSLOGTIMESTAMP:timestamp}|%{TIMESTAMP_ISO8601:syslog5424_ts}) (?:%{SYSLOGFACILITY} )?%{SYSLOGHOST:log_source}+(?: %{SYSLOGPROG}:|)
(Note that this pattern is quite long; you may need to scroll horizontally to see all of it.)
SYSLOGTIMESTAMP and the patterns it refers to are defined as follows:
SYSLOGTIMESTAMP %{MONTH} +%{MONTHDAY} %{TIME}
MONTH \b(?:[Jj]an(?:uary|uar)?|[Ff]eb(?:ruary|ruar)?|[Mm](?:a|ä)?r(?:ch|z)?|[Aa]pr(?:il)?|[Mm]a(?:y|i)?|[Jj]un(?:e|i)?|[Jj]ul(?:y|i)?|[Aa]ug(?:ust)?|[Ss]ep(?:tember)?|[Oo](?:c|k)?t(?:ober)?|[Nn]ov(?:ember)?|[Dd]e(?:c|z)(?:ember)?)\b
MONTHDAY (?:(?:0[1-9])|(?:[12][0-9])|(?:3[01])|[1-9])
HOUR (?:2[0123]|[01]?[0-9])
MINUTE (?:[0-5][0-9])
SECOND (?:(?:[0-5]?[0-9]|60)(?:[:.,][0-9]+)?)
TIME (?!<[0-9])%{HOUR}:%{MINUTE}(?::%{SECOND})(?![0-9])
Pattern MINUTE is a simple regular expression that is referred to from TIME, which in turn is referred to from SYSLOGTIMESTAMP.
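The chain of references can be reproduced by hand with ordinary string substitution. The following Python sketch is only an illustration of the idea, not the way NSG Agent is implemented; the MONTH expression is shortened to English abbreviations to keep the example compact, and the other definitions are copied from the listing above.

```python
import re

# Leaf patterns, copied from the library listing above.
HOUR = r"(?:2[0123]|[01]?[0-9])"
MINUTE = r"(?:[0-5][0-9])"
SECOND = r"(?:(?:[0-5]?[0-9]|60)(?:[:.,][0-9]+)?)"
MONTHDAY = r"(?:(?:0[1-9])|(?:[12][0-9])|(?:3[01])|[1-9])"
# Assumption for brevity: English three-letter month names only.
MONTH = r"\b(?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)\b"

# Each %{NAME} reference is replaced with the text of the pattern it names.
TIME = rf"(?!<[0-9]){HOUR}:{MINUTE}(?::{SECOND})(?![0-9])"
SYSLOGTIMESTAMP = rf"{MONTH} +{MONTHDAY} {TIME}"

# The composed expression matches a complete syslog timestamp.
print(re.fullmatch(SYSLOGTIMESTAMP, "May 18 11:22:43") is not None)
```

Because TIME requires the seconds field, a shorter string such as May 18 11:22 is not accepted by the composed expression.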
Consider the following Grok pattern:
Line protocol on Interface %{INTERFACENAME:logIfLocalName}, changed state to %{INTERFACESTATUS:logIfLocalOperStatus}
This pattern is designed to match a very specific syslog message that may look like this:
Line protocol on Interface GigabitEthernet0/10, changed state to down
The pattern copies the fragments of the syslog message that must match verbatim, such as Line protocol on Interface and changed state to, and uses Grok patterns to capture the parts that vary from device to device and from interface to interface. The pattern matches a syslog message only when both the fixed parts and the embedded patterns match; it then produces two key-value pairs, one with the key logIfLocalName and the other with the key logIfLocalOperStatus.
Applying this pattern produces the following dictionary, which later becomes part of the document sent to ElasticSearch:
{
    "logIfLocalName": "GigabitEthernet0/10",
    "logIfLocalOperStatus": "down"
}
However, before the document is submitted to ElasticSearch, it is “enriched” with metadata that belongs to the device that generated the syslog message in the first place. This happens automatically, behind the scenes. As a result, the document stored in ElasticSearch for the original syslog message is associated with the device object used by NetSpyGlass, which in turn allows the user to build dashboards and alerts that combine syslog data with time series data collected from the device by polling.
Continuing with the example syslog message shown above, we can use it to count how many times the same network interface of the same device went “down” over a given period of time. Below, we are going to see how this can be done.
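To make the idea of counting concrete, here is a small in-memory Python sketch. It is not how NetSpyGlass performs the count (that is done on documents stored in ElasticSearch); the field names device and time are hypothetical stand-ins for the metadata added during enrichment, and the sample documents are invented for illustration.

```python
from collections import Counter
from datetime import datetime

IFACE = "GigabitEthernet0/10"

# Invented sample documents; "device" and "time" are hypothetical field names.
documents = [
    {"device": "sw1", "time": datetime(2021, 5, 18, 11, 0),
     "logIfLocalName": IFACE, "logIfLocalOperStatus": "down"},
    {"device": "sw1", "time": datetime(2021, 5, 18, 11, 5),
     "logIfLocalName": IFACE, "logIfLocalOperStatus": "up"},
    {"device": "sw1", "time": datetime(2021, 5, 18, 11, 9),
     "logIfLocalName": IFACE, "logIfLocalOperStatus": "down"},
    {"device": "sw2", "time": datetime(2021, 5, 18, 11, 7),
     "logIfLocalName": IFACE, "logIfLocalOperStatus": "down"},
    {"device": "sw1", "time": datetime(2021, 5, 19, 9, 0),
     "logIfLocalName": IFACE, "logIfLocalOperStatus": "down"},
]


def count_down_events(docs, start, end):
    """Count "down" transitions per (device, interface) within [start, end)."""
    return Counter(
        (doc["device"], doc["logIfLocalName"])
        for doc in docs
        if doc["logIfLocalOperStatus"] == "down" and start <= doc["time"] < end
    )


counts = count_down_events(documents, datetime(2021, 5, 18), datetime(2021, 5, 19))
for (device, interface), n in sorted(counts.items()):
    print(device, interface, n)
```

The device name has to be part of the grouping key because the same interface name (here GigabitEthernet0/10) exists on many devices; this is why the enrichment step matters.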
Parsing of the input message is done in a cascading manner: the system applies the regular expressions defined by all Grok patterns used to compose the pattern being applied. To do this, the Grok parser converts the original pattern into a regular expression by replacing named patterns with their definitions.
So, if Grok patterns INTERFACENAME and INTERFACESTATUS are defined as
INTERFACENAME [a-zA-Z][a-zA-Z0-9\\-./]+
INTERFACESTATUS (up|down)
the resulting regular expression becomes
Line protocol on Interface (?P<logIfLocalName>[a-zA-Z][a-zA-Z0-9\\-./]+), changed state to (?P<logIfLocalOperStatus>(up|down))
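The resulting expression can be tried directly with Python's re module, which uses the same (?P<name>...) syntax for named groups. This is a minimal sketch; it assumes the doubled backslash in the listing is an artifact of configuration-file escaping, so a single backslash is used before the hyphen here.

```python
import re

# Expanded form of the Grok pattern; the named groups carry the key names.
LINK_STATE = re.compile(
    r"Line protocol on Interface (?P<logIfLocalName>[a-zA-Z][a-zA-Z0-9\-./]+), "
    r"changed state to (?P<logIfLocalOperStatus>(up|down))"
)


def parse(message):
    """Return the extracted key-value pairs, or None if the message does not match."""
    match = LINK_STATE.search(message)
    return match.groupdict() if match else None


print(parse("Line protocol on Interface GigabitEthernet0/10, changed state to down"))
```

Only named groups appear in the result, so the inner unnamed group (up|down) does not add a key of its own.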
Since Grok allows one to build patterns that refer to other patterns, we can build a catalog of Grok expressions, for example:
WORD [a-zA-Z][a-zA-Z0-9\\-./]+
INTERFACENAME %{WORD}
INTERFACESTATUS (up|down)
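The way a parser might resolve such a catalog can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the NetSpyGlass implementation: the names CATALOG, REFERENCE and expand are hypothetical, references without a key are wrapped in a non-capturing group, and a single backslash is used before the hyphen on the assumption that the doubled one is configuration-file escaping.

```python
import re

# The catalog from the example above, keyed by pattern name.
CATALOG = {
    "WORD": r"[a-zA-Z][a-zA-Z0-9\-./]+",
    "INTERFACENAME": "%{WORD}",
    "INTERFACESTATUS": "(up|down)",
}

# Matches %{NAME} and %{NAME:key}.
REFERENCE = re.compile(r"%\{(\w+)(?::(\w+))?\}")


def expand(pattern, catalog):
    """Replace every %{NAME} or %{NAME:key} with the regex it stands for."""
    def substitute(ref):
        name, key = ref.group(1), ref.group(2)
        body = expand(catalog[name], catalog)  # resolve nested references
        return f"(?P<{key}>{body})" if key else f"(?:{body})"
    return REFERENCE.sub(substitute, pattern)


GROK = ("Line protocol on Interface %{INTERFACENAME:logIfLocalName}, "
        "changed state to %{INTERFACESTATUS:logIfLocalOperStatus}")
regex = expand(GROK, CATALOG)
message = "Line protocol on Interface GigabitEthernet0/10, changed state to down"
print(re.search(regex, message).groupdict())
```

A reference to a name that is missing from the catalog raises KeyError in this sketch, which is a simple way to surface typos in pattern names.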
NetSpyGlass comes with a catalog of pre-built Grok expressions that match typical patterns found in log messages generated by network devices from different vendors. Customers can extend this catalog by adding their own patterns and Grok expressions to match specific log messages as needed. These should be added to the configuration file grok.conf, as explained in Configuration.
NetSpyGlass provides the command-line tool nsggrok, which helps to test custom or pre-built Grok patterns.
The next example shows the result of parsing the message hello world with the pattern hello %{WORD:name} using nsggrok:
nsggrok --pattern "hello %{WORD:name}" text "hello world"
{
    "name": "world"
}
The nsggrok script is part of the open source Python package nsgcli (https://github.com/happygears/nsgcli).
The next example demonstrates what happens when we parse a syslog message using only pre-built patterns. The input syslog message is
<13>May 18 11:22:43 carrier sshd: SSHD_LOGIN_FAILED: Login failed for user 'root' from host '10.1.1.1'
and nsggrok is invoked as follows:
nsggrok log "<13>May 18 11:22:43 carrier sshd: SSHD_LOGIN_FAILED: Login failed for user 'root' from host '10.1.1.1'"
{
    "sshUser": "root",
    "index": "labdcdev-syslog-short",
    "sshSrcIp": "10.1.1.1",
    "logText": "SSHD_LOGIN_FAILED: Login failed for user 'root' from host '10.1.1.1'",
    "prio": "13",
    "logSyslogSeverityName": "notice",
    "logSource": "carrier",
    "logSyslogText": "<13>May 18 11:22:43 carrier sshd: SSHD_LOGIN_FAILED: Login failed for user 'root' from host '10.1.1.1'",
    "logSyslogFacilityName": "user",
    "logTimestamp": "2021-05-18T11:22:43.000Z",
    "program": "sshd",
    "logSyslogPriority": 13,
    "timestamp": "May 18 11:22:43",
    "logSyslogFacilityCode": 1,
    "logSyslogSeverityCode": 5
}
As you can see, the built-in Grok patterns extract a lot of useful information from this syslog message. The key logSource describes the device that sent the message; all we can learn about the device from the message itself is its name, carrier.
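The numeric priority fields in the output above follow the standard syslog arithmetic: the value in angle brackets is the facility code multiplied by 8 plus the severity code. The Python sketch below reproduces those numbers for the sample message. It reuses the key names from the output for easy comparison, but the function decode_priority is hypothetical and says nothing about how nsggrok computes these fields internally.

```python
import re

# Severity keywords in code order 0..7 (standard syslog numbering).
SEVERITY_NAMES = ["emerg", "alert", "crit", "err", "warning", "notice", "info", "debug"]


def decode_priority(message):
    """Split the leading <PRI> value into facility and severity codes."""
    match = re.match(r"<(\d{1,3})>", message)
    if match is None:
        return None  # the message carries no priority prefix
    priority = int(match.group(1))
    facility, severity = divmod(priority, 8)  # PRI = facility * 8 + severity
    return {
        "logSyslogPriority": priority,
        "logSyslogFacilityCode": facility,
        "logSyslogSeverityCode": severity,
        "logSyslogSeverityName": SEVERITY_NAMES[severity],
    }


line = ("<13>May 18 11:22:43 carrier sshd: SSHD_LOGIN_FAILED: "
        "Login failed for user 'root' from host '10.1.1.1'")
print(decode_priority(line))
```

For priority 13 this gives facility code 1 and severity code 5 (notice), in agreement with the nsggrok output.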
The complete list of usage scenarios is available in the built-in help: nsggrok -h.