5. Data Flow

Beginning with v1.5.1, the NetSpyGlass server starts the Python hook script that performs calculations on monitoring data immediately after the monitors or secondary servers that feed data to it complete their push. This speeds up data flow through the NetSpyGlass servers that form the cluster.

In versions prior to 1.5.1, all monitors started data collection, and all servers collected data from monitors and secondary servers, at the same time. For example, in systems running with a 1 min polling interval, the sequence started at the beginning of each minute. If monitors started their cycle at 00:00 and completed it a few seconds before 00:01, the secondary servers collected the data at 00:01 and then ran the Python hook script to perform calculations on it. Calculations took a few seconds, after which the results were available for collection by the primary server. However, this collection did not happen until the beginning of the next cycle, at 00:02. The primary server required some time to perform its own calculations too, which means data became available to the UI and alerts a few seconds past 00:02, a delay of over 2 min after the start of the cycle. If the primary server fed data to a dedicated alerts server, that added a delay of one more polling cycle, making alerts over 3 min late.

In clusters running with a 30 sec polling interval, the delay from a monitor to the primary server was over 90 sec. The delay was most pronounced, however, in systems running with longer polling intervals. For example, in a system built with just one monitor and one server and running with a 5 min polling interval, data was available to the UI and monitors 5 min late. In a three-tier cluster built from monitors, secondary servers and the primary server, data was available over 10 min late.
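
The 1 min and 5 min figures quoted above follow from simple arithmetic. The sketch below is a simplified model, not NetSpyGlass code: it assumes that monitors need almost a full polling interval to collect data, that each server tier waits for the next cycle boundary before it collects from the tier below, and that calculations take a few seconds. The function name and the 5 sec calculation time are assumptions made for illustration.

```python
def old_delay_sec(polling_interval_sec, server_tiers, compute_sec=5):
    """Approximate delay, in seconds, from the start of a polling cycle
    until the last server tier has finished its calculations.

    Simplified model of the pre-1.5.1 behavior: server tier k collects
    data at the k-th cycle boundary after the start of the cycle, so only
    the calculation time of the last tier is added on top of a whole
    number of polling intervals. compute_sec is an assumed value.
    """
    return server_tiers * polling_interval_sec + compute_sec


# 1 min polling, secondary + primary server: a few seconds over 2 min
print(old_delay_sec(60, server_tiers=2))
# same cluster plus a dedicated alerts server: a few seconds over 3 min
print(old_delay_sec(60, server_tiers=3))
# 5 min polling, secondary + primary server: a few seconds over 10 min
print(old_delay_sec(300, server_tiers=2))
```

In this model the delay grows linearly with both the polling interval and the number of server tiers, which is why long polling intervals and multi-tier clusters were affected the most.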

This release improves data flow speed by making servers perform calculations immediately after the monitors or lower level servers that feed data to them signal the end of the data push. If the cluster is running with a 1 min polling interval and monitors spread SNMP polling over 50 sec, secondary servers start their calculations as soon as monitors finish uploading data, or approximately on the 51st second of the cycle. Secondary servers push data to the primary immediately after completing their calculations, and the primary, in turn, begins its calculations as soon as the secondary servers complete their push. With this improvement, data becomes available in the primary server 1 minute and a few seconds after the beginning of the cycle, compared to a delay of over 2 minutes in the old versions. In clusters running with a 5 min polling interval, data becomes available with a delay of 1 min instead of 5 min.
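
The 1 min example can be checked with a similar back-of-the-envelope calculation. The sketch below is again only an illustrative model: the 50 sec polling spread comes from the text above, while the function name and the push and calculation times (1, 2 and 5 sec) are assumed values, not measurements.

```python
def new_delay_sec(poll_spread_sec, tier_stages_sec, monitor_push_sec=1):
    """Approximate delay, in seconds, from the start of a polling cycle
    until the last server tier has finished its calculations, when every
    tier starts work as soon as the tier below signals the end of push.

    tier_stages_sec is a list of (compute_sec, push_sec) pairs, one per
    server tier, lowest tier first. The last tier does not push further,
    so its push_sec is 0. All durations are assumed values.
    """
    delay = poll_spread_sec + monitor_push_sec
    for compute_sec, push_sec in tier_stages_sec:
        delay += compute_sec + push_sec
    return delay


# Secondary server: 5 sec of calculations, 2 sec push to the primary.
# Primary server: 5 sec of calculations, results served directly.
stages = [(5, 2), (5, 0)]
print(new_delay_sec(50, stages))  # 63, i.e. 1 minute and a few seconds
```

Unlike the old scheme, the polling interval itself does not appear in this sum: the delay depends only on how long polling, pushes and calculations actually take.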

The following diagram illustrates the sequence of events that happen in the cluster during one polling cycle:

Primary server signals beginning of the cycle to all cluster members
at these moments:

       |                                  |
       |                                  |
   ----|----------------------------------|------------------------>
       | beginning of cycle N             | beginning of cycle N+1
       t1                                 t2


Monitor time line:
SNMP responses arrive from devices at these moments
(polling is spread out over an extended period of time
to avoid device CPU overload)

         |  |  |  |  |  |  |  |  |  |
         v  v  v  v  v  v  v  v  v  v
   ----|----------------------------------|------------------------>
       |                             |  | | beginning of cycle N+1
      t1                             |  | t2
                                     |  |
monitor data push. all               |  |
observations have time stamp t1   -> |  |
                                     |  | <-- monitor signals end of push
Secondary server time line:          |  |
                                     V  V
   ----|----------------------------------|----------------------------------|-->
       t1                                ^ t2    ^ |                         t3
     beginning of cycle N                |       | |
                                         |       | |
secondary server unpacks data and  ------+       | |
begins computations. Note that this              | |
may happen after the beginning of                | |
the next cycle (past time t2)                    | |
                                                 | |
sec. server finishes computations   -------------+ |
and begins data push to the primary                |
                                                   |   |  +--- push from other sec. servers
sec. server finishes data push and signals  -----> |   |  |
end of push                                        |   |  |
                                                   |   |  |
                                                   |   |  |
Primary server time line:                          |   |  |
                                                   V   V  V
   ----|----------------------------------|----------------------------------|-->
      t1                                  t2              ^    ^
                                                          |    |
                                                          |    |
primary received data push from all                       |    |
secondaries it expects to get it from    -----------------+    |
and begins computations                                        |
                                                               |
computation results are available to the UI   -----------------+
and APIs. This is the end of the cycle
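
The rule "begin computations once every expected feeder has signaled the end of its push" can be sketched in a few lines of Python. This is a hypothetical illustration of the idea and not the NetSpyGlass implementation: the class PushTracker, its methods and the feeder names are invented for this example. Pushes are tracked per cycle start time because, as the diagram notes, data for cycle N may arrive after cycle N+1 has already begun.

```python
from collections import defaultdict


class PushTracker:
    """Hypothetical helper: starts calculations for a cycle as soon as
    all expected feeders have signaled the end of their data push."""

    def __init__(self, expected_feeders, start_calculations):
        self.expected = frozenset(expected_feeders)
        self.start_calculations = start_calculations
        # cycle start time -> set of feeders that finished their push
        self.pending = defaultdict(set)

    def end_of_push(self, cycle_start, feeder):
        """Record the end-of-push signal from one feeder. Returns True
        if this signal completed the set and calculations were started."""
        if feeder not in self.expected:
            return False  # ignore feeders this server does not expect
        done = self.pending[cycle_start]
        done.add(feeder)
        if done == self.expected:
            del self.pending[cycle_start]
            self.start_calculations(cycle_start)
            return True
        return False


started = []
tracker = PushTracker(["secondary-1", "secondary-2"], started.append)
tracker.end_of_push(0, "secondary-1")   # still waiting for secondary-2
tracker.end_of_push(60, "secondary-1")  # next cycle is tracked separately
tracker.end_of_push(0, "secondary-2")   # cycle 0 is complete
print(started)  # [0]
```

A real server would also need a timeout for feeders that never signal the end of push; that is left out of this sketch.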