How to perform NSM5200 Filesystem xfs repair.

NOTICE

POTENTIAL FOR DATA LOSS.
The steps detailed in the resolution of this article may result in a loss of critical data if not performed properly. Before beginning these steps, make sure all important data is backed up in the event of data loss. If you are unsure, please contact Product Support Services prior to attempting the procedure below.

NOTICE

COMPLEX PROCEDURE REQUIRED.
The resolution of this article has many complex steps that may result in unforeseen results if not performed correctly. If you are at all unfamiliar with the requirements, please contact Product Support Services for assistance.

Issue

  • No color in video playback timeline.
  • Unexplained color gaps in the video playback timeline.
  • Sudden significant shortage in usual video retention time.
  • Endura Diagnostics reports errors in volumes, offline volumes, core dumps or xfs errors.

Product Line

Pelco Video Management

Environment

  • Pelco Endura NSM5200.
  • Pelco Endura EE500

Cause

  • Failed/Failing Hard Disk Drive(s).
  • Failed/Failing RAID Array Adapter(s).
  • Excessive Heat or Vibration in installed environment
  • Power outages or unstable/unclean power in installed environment.
  • Ungraceful restart/reboot of unit(s).
  • Compounding of minor Filesystem Corruptions caused by normal product use over time aka "wear and tear".
  • Possible Hardware or Software defect.

Resolution

note:  Pelco does not recommend these steps be attempted without supervision from a Pelco Product Support Technician either on the phone or via Remote Desktop Session.  
note:  Exactly typed command syntax or selections are given in blue
note:  Except where directed, proceed through the steps of this article from top to bottom, in the order they are presented.

Section 1: How to repair all Data Volumes on an NSM5200

1. Launch Endura Utilities, log in and press Search.
note: The default login credentials for Endura Utilities version 2.2 and below are [Username: Administrator and Password: configapp]. Endura Utilities version 2.3 and greater removed that unique login requirement, authenticating instead against the standard Endura credentials used for WS5XXX and VCD5200; the default login in that case is [Username: admin and Password: admin].

2. In the System Attributes tab, right-click the NSM5200 in question and select SSH Into.

note:  The default Linux Administrative login credentials for Endura Linux devices are [Username: root and Password: pel2899100]

note:  If you have never SSH'd into the unit before, you will likely receive a warning prompt. Simply click Yes to proceed.

note: If you receive an error when launching the SSH session (typically because putty.exe cannot be found on the workstation), visit http://www.putty.org/ to download putty.exe, then copy it into your workstation's c:\windows\system32 folder.

note: If SSH fails to connect in any other fashion, the NSM5200 may not be able to fully boot up, and you will need to connect a VGA Monitor and PC Keyboard directly at the local NSM5200 console in order to proceed. 
 
3. Stop all Services and unmount all data volumes:
service pald stop
service pal3d stop
service nsxd stop
umount /data/*
Note:  If the umount /data/* command fails to unmount /data/local_0 or /data/local_1, then use the following command:
service mbrd stop
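
The commands in step 3 can be sketched as one shell function. This is only an illustration: the function name stop_services_and_unmount is hypothetical, and it assumes that umount returns a non-zero exit status when a data volume stays mounted and that the unmount should be retried once mbrd has been stopped.

```shell
#!/bin/sh
# Sketch only. stop_services_and_unmount is a hypothetical helper name.
stop_services_and_unmount() {
    # Stop the services named in step 3, in the order given there.
    for svc in pald pal3d nsxd; do
        service "$svc" stop
    done

    # First unmount attempt. Assumption: a non-zero exit status means
    # at least one data volume is still mounted.
    if ! umount /data/*; then
        # Fallback from the note in step 3: stop mbrd, then retry
        # (the retry is an assumption, not stated in the step).
        service mbrd stop
        umount /data/*
    fi
}
```

Defining the function does nothing by itself; it only runs when stop_services_and_unmount is called from the root shell on the NSM5200.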
 
 
4. If the NSM5200 in question has...
a. 4TB Hard Disk Drives, skip the rest of this step and step 5, and proceed to "Section 2: How to repair Data Volumes individually".
or...
b. 3TB or smaller drives, continue by executing the xfs_repair_array command.
note: If this fails, add clearlog to the end of the command (xfs_repair_array clearlog). If it still fails, skip step 5 and proceed to "Section 2: How to repair Data Volumes individually".
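
The fallback order for units with 3TB or smaller drives can be sketched as below. The wrapper name repair_array is hypothetical, and the sketch assumes that xfs_repair_array signals failure with a non-zero exit status, which this article does not confirm.

```shell
#!/bin/sh
# Sketch only. repair_array is a hypothetical wrapper name.
# Assumption: xfs_repair_array exits non-zero when the repair fails.
repair_array() {
    if xfs_repair_array; then
        echo "array repair finished; reboot next"
    elif xfs_repair_array clearlog; then
        echo "array repair finished after clearlog; reboot next"
    else
        # Both attempts failed: do not reboot, go to Section 2 instead.
        echo "array repair failed; repair the volumes individually (Section 2)" >&2
        return 1
    fi
}
```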
 
5. Once the repair has finished, issue the reboot command.

 
6. After the final reboot, use the df command to ensure the Video Array Volumes are mounted (/dev/sdb1 at /data/local_0 and /dev/sdb8 at /data/local_1).
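
The mount check in step 6 can also be done with a small filter instead of reading the df output by eye. This is a minimal sketch: check_video_mounts is a hypothetical helper name, and it only looks for the two device and mount point pairs named in step 6.

```shell
#!/bin/sh
# Sketch only. check_video_mounts is a hypothetical helper that reads
# df output on standard input and reports any missing video volume.
check_video_mounts() {
    awk '
        $1 == "/dev/sdb1" && $NF == "/data/local_0" { a = 1 }
        $1 == "/dev/sdb8" && $NF == "/data/local_1" { b = 1 }
        END {
            if (!a) print "MISSING: /dev/sdb1 on /data/local_0"
            if (!b) print "MISSING: /dev/sdb8 on /data/local_1"
            exit ((a && b) ? 0 : 1)
        }
    '
}
```

A possible invocation is df -P | check_video_mounts, which prints nothing and returns success when both volumes are mounted.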


 
Section 2: How to repair Data Volumes individually
note: Before proceeding, ensure you've completed step 3 above in the previous section.
note: This section should be completely reviewed prior to issuing commands. Contact Pelco Product Support for assistance if needed.

1. First determine which volumes are present for repair. 
To repair individual volumes, the nr_xfs_repair command must be invoked, and both the data and the log volumes must be specified in the command switch options.

For NSM5200 units with 3TB or smaller hard disk drives, there will be 2 data volumes, each with its own log volume. The 1st is /dev/sdb1 with the log volume /dev/sdb2, so the command to repair it is...

nr_xfs_repair  /dev/sdb1  -l  /dev/sdb2

The 2nd is /dev/sdb8 with the log volume of /dev/sdb7...

nr_xfs_repair  /dev/sdb8  -l  /dev/sdb7

For NSM5200 units with 4TB HDDs, there may be 4 data volumes instead of 2, each with their own log volume. To check for this, enter the cat /proc/partitions command and examine the output.

[Screenshot: typical cat /proc/partitions output on a 4TB unit, annotated with the repair command for each volume pair]
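
To make the partition list easier to read, the output of cat /proc/partitions can be filtered down to the partitions of the video array. The sketch below uses a hypothetical helper name, list_sdb_partitions, and assumes the array is the sdb device, as in the 3TB examples above. It only lists partitions with their size in blocks; it does not decide which partition is a data volume and which is a log volume.

```shell
#!/bin/sh
# Sketch only. list_sdb_partitions is a hypothetical helper that reads
# /proc/partitions-style text on standard input.
list_sdb_partitions() {
    # Columns are: major minor #blocks name. The header, the blank line,
    # and the whole-disk entry (sdb) do not match the pattern.
    awk '$4 ~ /^sdb[0-9]+$/ { printf "/dev/%s %s\n", $4, $3 }'
}
```

A possible invocation is list_sdb_partitions < /proc/partitions.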

 

2. Now that we know which volumes are available for repair, issue the appropriate nr_xfs_repair commands as outlined in step 1 above. If errors are received, choose the appropriate course of action as outlined below...

a. If the following error is received...
nr_xfs_repair: prefetch.c:432: pf_batch_read: Assertion `((xfs_daddr_t)(((xfs_fsblock_t)(((xfs_agnumber_t)((fsbno) >> (mp)->m_sb.sb_agblklog))) * (mp)->m_sb.sb_agblocks + (((xfs_agblock_t)((fsbno) & xfs_mask32lo((mp)->m_sb.sb_agblklog))))) << (mp)->m_blkbb_log)) == ((bplist[0])->b_blkno)' failed.

...add the -P option to the command switch options, for example...

nr_xfs_repair  -P  /dev/sdb1  -l  /dev/sdb2

b. If the following error is received...
Error: the filesystem has valuable metadata changes in a log which needs to be replayed. Mount the filesystem to replay the log, and unmount it before re-running xfs_repair. If you are unable to mount the filesystem, then use the -L option to destroy the log and attempt a repair. Note that destroying the log may cause corruption -- please attempt a mount of the filesystem before doing this.

...attempt to mount and then unmount the data volume in question, then re-attempt nr_xfs_repair, for example...

mount  -t  xfs  -o  logbufs=4,osyncisdsync,logdev=/dev/sdb2  /dev/sdb1  /data/local_0
umount /data/*
nr_xfs_repair /dev/sdb1 -l /dev/sdb2

...if the nr_xfs_repair attempt still fails with the same error, or the data volume will not mount, add the -L option to the command switch options, for example...

nr_xfs_repair  -L  /dev/sdb1  -l  /dev/sdb2

note: Using the -L option may result in video loss.
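
The decision logic of step 2 can be summarized as a small text classifier. This is a sketch, not part of the Pelco tooling: suggest_next_step is a hypothetical helper that reads captured nr_xfs_repair output and prints which of the two cases above applies. It deliberately does not run any repair itself, because the -L option may result in video loss.

```shell
#!/bin/sh
# Sketch only. suggest_next_step is a hypothetical helper that reads
# nr_xfs_repair output on standard input and prints the matching action.
suggest_next_step() {
    output=$(cat)
    case "$output" in
        *"pf_batch_read: Assertion"*)
            # Case a: prefetch assertion failure.
            echo "retry with -P" ;;
        *"valuable metadata changes in a log"*)
            # Case b: the log must be replayed first.
            echo "mount and unmount the volume, retry; use -L only if that fails" ;;
        *)
            echo "no known error pattern matched" ;;
    esac
}
```

A possible invocation is nr_xfs_repair /dev/sdb1 -l /dev/sdb2 2>&1 | suggest_next_step.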

3. After repairs have finished, issue the reboot command and wait for all services to finish startup (6 to 15 minutes) to resume normal operation.
note: If any other issues arise, or these commands do not work for you, contact Pelco Product Support at 1 (800) 289-9100 for assistance.