1.18. Release Notes 1.3.0

NetSpyGlass v1.3.0

1.18.1. New features

  • NSG server now creates the log file logs/var_diff.log, where it writes a “diff” computed on the monitoring variables pushed by monitors and secondary servers. The diff is also computed at the end of the python rules script run. Changes in device id, name, index and other parameters of the data source, as well as changes in the set of tags, are recorded. The diff is expensive to compute and log, so it is only done if the debug level is set to a value greater than 3.

  • Server is now Java 8 ready

  • We have added the ability to choose which tag facets are shown in the input field “Filter by tags” in Graphing Workbench. Clicking the new UI control (just to the left of the logical operation selection “Contains all of” / “Contains any of” etc) opens a list of tag facets with checkboxes. Turning a checkbox off in this list suppresses that tag facet in the filter. This should help the user reduce the length of the tag filter on installations with a large number of devices and interfaces. Simply turn off the facets you never use to make the filter more responsive.

  • A change in the view builder python hook script: function execute() of the class declared in this script is now passed an argument, a list of all devices defined in the system as PyDevice objects. This function should be declared in the script as follows:

    class UserViewBuilder(object):
        """
        NetSpyGlass uses this class and its functions as an API to the view
        configuration managed by the user.
    
        :param log:   system logger object
        """
    
        def __init__(self, log):
            self.log = log
    
        def execute(self, devices):
            """
            Generate list of View objects to describe views. Views can
            refer to each other by name using attribute "parent" to establish
            hierarchy. This function only creates "blank" views, it does not add
            devices to them.
    
            :param devices: list of :class:`PyDevice` objects (all devices monitored by NetSpyGlass)
            :rtype : list of strings
            :return:  list of View objects
            """
    

    The list of devices can be used to generate views automatically, based on the device naming convention or addressing scheme.

  • A new monitoring variable alertCount has been introduced. The value of this variable is the total number of active alerts for each device.

  • This release introduces support for aggregate device-related variables for nested views. This is similar to how we calculate aggregate values for interface-related variables such as ifInRate, ifOutRate etc, except this applies to device variables. To calculate the value, call python function compute_sum_for_clusters() in your python data processing rules script. Just like in the case of interface-related variables, aggregate value is computed as a sum of values of all contributing variables. For each view, we add up values of the variables for all devices in the view and all nested views inside of it, recursively. There is no limit to the depth of the views hierarchy for the purposes of this calculation. Since aggregate value is calculated as a sum, it makes sense only for certain types of variables. Currently this is calculated for the following variables: alertCount, minorChassisAlarm and majorChassisAlarm.

  • The default python rules hook script has been refactored to put all calls to nw2functions.compute_sum_for_clusters() in one place - function interface_var_aggregates(). Functions device_var_aggregates() and interface_var_aggregates() calculate values of aggregate variables created for objects that represent nested views in network maps. In cluster configurations where all basic calculations (interface stats rates and so on) are performed by the secondary servers, the python rules hook script of the primary server does not have to call execute() of the base class anymore, provided it calls functions device_var_aggregates() and interface_var_aggregates():

    import nw2rules
    
    class UserRules(nw2rules.Nw2Rules):
        def __init__(self, log):
            super(UserRules, self).__init__(log)
    
        def execute(self):
    
            # there is no need to call this now
            # super(UserRules, self).execute()
    
            # but we need to call these instead
            self.device_var_aggregates()
            self.interface_var_aggregates()
    
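As an illustration of the view builder change described in this list, the sketch below shows one way the device list passed to execute() might be turned into blank views based on a naming convention. It is not NetSpyGlass code: Device and View here are hypothetical stand-ins for the real PyDevice and View classes (whose attributes and constructors may differ), and the "<site>-<role>" naming convention is an assumption made only for this example.

```python
from collections import namedtuple

# Hypothetical stand-ins for illustration only; the real PyDevice and View
# classes are provided by NetSpyGlass and may look different.
Device = namedtuple('Device', ['name'])
View = namedtuple('View', ['name', 'parent'])


def views_from_device_names(devices, root='All sites'):
    """Create one blank view per site prefix found in the device names.

    Assumes device names of the form "<site>-<role>", e.g. "sjc-rtr1".
    Views refer to their parent by name; no devices are added to them.
    """
    sites = sorted(set(d.name.split('-', 1)[0] for d in devices))
    views = [View(name=root, parent=None)]
    views.extend(View(name=site, parent=root) for site in sites)
    return views


devices = [Device('sjc-rtr1'), Device('sjc-sw2'), Device('nyc-rtr1')]
views = views_from_device_names(devices)
```

In a real hook script, logic along these lines would live inside UserViewBuilder.execute(devices) and build actual View objects instead of the stand-in tuples.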

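The aggregation for nested views described in this section (a sum over all devices in a view and in every view nested inside it, recursively) can be sketched in plain Python as follows. This only illustrates the arithmetic and is not the implementation of compute_sum_for_clusters(); the dictionaries, view names and device names are hypothetical, and the sketch assumes each device belongs to exactly one view.

```python
def aggregate(view, nested, devices_in_view, value_by_device):
    """Sum a device variable over `view` and all views nested inside it."""
    # devices that belong directly to this view
    total = sum(value_by_device.get(d, 0) for d in devices_in_view.get(view, []))
    # recurse into nested views; the depth of the hierarchy is not limited
    for child in nested.get(view, []):
        total += aggregate(child, nested, devices_in_view, value_by_device)
    return total


# Hypothetical hierarchy: "world" contains "us" and "eu", "us" contains "sjc"
nested = {'world': ['us', 'eu'], 'us': ['sjc']}
devices_in_view = {
    'world': [],
    'us': ['nyc-rtr1'],
    'sjc': ['sjc-rtr1', 'sjc-sw2'],
    'eu': ['ams-rtr1'],
}
# Hypothetical per-device values of a variable such as alertCount
alert_count = {'nyc-rtr1': 1, 'sjc-rtr1': 2, 'sjc-sw2': 0, 'ams-rtr1': 3}

totals = dict((v, aggregate(v, nested, devices_in_view, alert_count))
              for v in devices_in_view)
```

Because the aggregate is a plain sum, it is meaningful for counters such as alertCount but not for variables like temperatures or percentages.
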
1.18.2. Bug fixes

  • NSGDB-22 add “Link.” tag to the variables associated with cluster interfaces

  • Several failure scenarios associated with data push from monitors and secondary servers have been tested and fixes have been implemented

  • NSGDB-42 fixed support for discovery of LLDP links between aggregator ports Cisco Nexus -> Cisco Nexus and Cisco Nexus -> Arista

  • NSGDB-35 server data push should work when parameter variables in the push clause in cluster.conf file has value [“*”]

  • NSGDB-41 changed the mechanism used to get the name of the input variable in alerts. Unfortunately, a universally “correct” way to handle this does not exist. Variable names are external to the variable objects, and the rule processor does not pass variable-name bindings to functions; it only passes the variable itself. Variable objects are pooled and reused, which means the same variable object may be bound to different names from one processing cycle to the next. Variables can also be created in python rules that call new_var(). These variables do not have a name at all (they are “anonymous” until the call to export_var()), and if they are used as an input for the call to alert(), their name is impossible to get. However, I was able to implement a method that will get the name of the original variable in simple cases. For example, it works if the input variable in the call to alert() was obtained by calling import_var() or passed through some simple calculations. Example:

    alert(
        name='dataProcessorOutOfTime',
        input=import_var('freeTime'),
        condition=lambda mvar, x: x < 20,
        description='data processor free time is under 20 sec',
        duration=0,
        notification_time=3600,
        streams=['log'],
        fan_out=True
    )
    

    Here the input variable has the name “freeTime”, and since it was just imported and used, the name can be found and will appear in the fields “inputVariable” and “values” of the alert. The name will “survive” if the input variable passes through simple calculations, such as arithmetic operations, but may be lost if the calculations involve multiple variables.

  • NSGDB-44 we should be able to discover links that go over aggregator ports when LAG discovery data is not available and we can’t discover links that go over aggregation ports

  • (bug with no number, related to NSGDB-44): fixed discovery of LAG aggregator interface links between Juniper and Arista devices. Arista reported Juniper’s physical port (e.g. xe-0/3/0) as LLDP “remote” port, but on Juniper LAG aggregation ports are subinterfaces, so we need to scan all subinterfaces of xe-0/3/0 to find corresponding aggregator “ae” interface.

  • NET-1113 this bug caused problems with synchronizing lists of devices between NSG servers and monitors right after restart.

  • NSGDB-46 improve method used to discover software revision on Cisco NX-OS and Arista devices

  • NET-1129 improve method used to save views to the database to make it more reliable on the installations with large number of views and maps. This resolves the following exception:

    org.hibernate.TransientObjectException: object references an unsaved transient instance - save the transient instance before flushing: net.happygears.nw2.db.hbm.ViewModel