How NSM5200 Network Storage Managers are Promoted

Issue

When a manager of an Endura storage pool goes offline, one of the other NSM5200s in the pool will be promoted to be the new manager.

Product Line

Pelco Video Management

Environment

NSM5200 Network Storage Manager, all versions

Cause

An NSM5200 pool uses several criteria to choose a new manager from among its member NSM5200s.

Resolution

In early versions of the NSM5200, when an NSM5200 pool manager failed, the member with the lowest IP address would assume the management role. That has changed. The new method has three tiers of verification, checked in order:

  1. Checking the priority level. By default, all NSMs have a priority of 100. When you use the Web interface to configure a unit as a manager, its priority is increased to 110. When you configure it as a member, its priority is decreased to 90. If a manager fails and a member takes over, that member's priority increases to 100. If the original manager (priority 110) comes back online as a member and the acting manager later fails, the 110 unit automatically takes the manager role back because it has the higher priority. To check the priority, first open an SSH session to the NSM5200 using the instructions in Lessons Learned Entry 13089, then attach to the pal runtime with screen and type clients:

      
    [root@Manager root]# screen -r pal
    pal# clients

    ConfigDB is online
    Address         | Uptime       | Priority | UUID                                 | Master
    -----------------------------------------------------------------------------------------
    *172.30.108.191 | 17104.968949 | 110      | 757ae3ae-315c-49fe-9d63-baaec44ed717 | YES
    172.30.108.197  | 16579.544636 | 90       | 00c58b43-be23-464b-a675-d42324aa8ed9 | NO
    


    Priority | Meaning
    ----------------------------------------------------------------------------------
    -1       | Device is in the pal’s server list but has never come online
    0        | Device has been seen, but is now offline
    10       | Device has an invalid or old database
    90       | Device has been configured once through the Web interface as a member
    100      | Default priority when a unit comes online the first time
    110      | Device has been configured once through the Web interface as a master
  2. Checking the last update to the configuration database (CDB). Each unit checks its latest incremental update of the CDB. Units that have the most up-to-date increment remain in contention to become the new manager.
  3. Checking the UUID. This is the most common tiebreaker, because most members match on the first two criteria. UUIDs are compared by their starting value on a hexadecimal scale from 0 (lowest) to F (highest), and the unit with the highest UUID becomes the manager (e.g., UUID 5xxxxxx would lose to UUID Fxxxxxx).
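
The three tiers above can be pictured as a single ordered comparison. The following Python sketch is illustrative only and is not the NSM5200's actual code. The names Unit, parse_clients_row, pick_manager, and the cdb_increment field are hypothetical. It also assumes that units with priority 0 or -1 are offline and cannot be chosen, and that UUIDs are compared character by character as hexadecimal strings.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Unit:
    address: str
    priority: int       # value of the Priority column in the clients output
    cdb_increment: int  # hypothetical: number of the latest CDB incremental update
    uuid: str


def parse_clients_row(line: str, cdb_increment: int = 0) -> Unit:
    """Parse one data row of the clients table (pipe-separated columns)."""
    fields = [f.strip() for f in line.split("|")]
    address = fields[0].lstrip("*")  # drop the leading '*' marker if present
    # cdb_increment is not in the clients output, so the caller supplies it.
    return Unit(address, int(fields[2]), cdb_increment, fields[3])


def pick_manager(units: List[Unit]) -> Optional[Unit]:
    """Apply the tiers in order: priority, then CDB increment, then UUID."""
    # Assumption: priority 0 (offline) and -1 (never seen) cannot take over.
    online = [u for u in units if u.priority > 0]
    if not online:
        return None
    # Tuples compare element by element, so a later tier only matters when
    # every earlier tier is tied. Lowercase hex strings of equal length sort
    # in the same order as their numeric values.
    return max(online, key=lambda u: (u.priority, u.cdb_increment, u.uuid.lower()))
```

For example, two members that both have priority 90 and the same CDB increment are separated only by UUID, so a unit whose UUID starts with F is chosen over one whose UUID starts with 5.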