Complete - Hazard Notice - NSP Emergency Change - Tuesday 22nd March 2016 1700-1900

Due to a software bug, NSP support will be performing an emergency change in the QH2 Zone.


#71029713 XenServer 6.5 QLogic LVMoHBA Kernel Panic


Users may have experienced interruptions to their services on 17/03/2016 and 06/03/2016.


The interruptions were caused by a software bug in the QLogic HBA kernel module, which is triggered by a change to the SAN topology or to the dual redundant paths to the storage.


Emergency Change Date: 22/03/2016 5PM - 7PM (NSP Maintenance Window)

Description: Emergency Change to Update Hypervisor HBA Drivers and Kernel in the QH2 Zone

On March 17th at 6:18 PM in the QH2 Zone, a change in the Hitachi storage caused one of the dual redundant paths to become unavailable.

This caused qh2-nsp05.nsp.nectar.org.au to kernel panic and hard reboot, as a result of a bug in the QLogic Fibre Channel HBA kernel module driver, which could not handle a multipath failover correctly.


On March 6th at 5:56 PM in the QH2 Zone, a change in the Hitachi storage caused one of the dual redundant paths to become unavailable.

This caused qh2-nsp01.nsp.nectar.org.au to kernel panic and hard reboot, as a result of the same bug in the QLogic Fibre Channel HBA kernel module driver, which could not handle a multipath failover correctly.


Driver Disk for QLogic qla2xxx-8.07.00.29.66.5_k - For XenServer 6.5.0

http://support.citrix.com/article/CTX205167

The driver currently in use is outdated: it is the original XenServer 6.5 kernel module, version 8.07.00.09.66.5-k.
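
For anyone wanting to sanity-check the two version strings quoted above, here is a minimal Python sketch that compares them field by field. The function names are hypothetical, and it assumes the dotted numeric fields order the releases and that the trailing "-k" / "_k" suffix can be ignored; it does not read anything from a live host.

```python
import re
from typing import Tuple


def parse_driver_version(version: str) -> Tuple[int, ...]:
    """Split a qla2xxx-style version string into integer fields.

    Assumption: the trailing '-k' or '_k' suffix carries no ordering
    information, so it is dropped before comparing.
    """
    core = re.sub(r"[-_]k$", "", version.strip())
    return tuple(int(part) for part in core.split("."))


def needs_update(installed: str, target: str) -> bool:
    """Return True when the installed version is older than the target."""
    return parse_driver_version(installed) < parse_driver_version(target)


# Version strings as quoted in this notice.
INSTALLED = "8.07.00.09.66.5-k"  # original XenServer 6.5 kernel module
TARGET = "8.07.00.29.66.5_k"     # driver disk for XenServer 6.5.0

if __name__ == "__main__":
    print(needs_update(INSTALLED, TARGET))  # prints True
```

Comparing tuples of integers avoids the trap of a plain string comparison, where the leading zeros and the differing suffix characters would influence the result.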


Impact: This change will be non-impacting, although 1% of instances running in the NSP QH2 Zone may fail to live migrate.

Issues Resolved:

XS65ESP1021:

- QLogic adapter ports can reset intermittently during normal operation resulting in packet loss or switchover. The issue is specific to QLogic cards.

- Removes a race condition in the control domain (dom0) kernel.

- When migrating, suspending or snapshotting VMs, a soft lockup in the control domain crashes the host. In rare circumstances, mapping a VM's memory can take sufficiently long that it triggers the soft lockup detector in the VM's kernel.

- XenServer host crashes after assigning GPU passthrough to a Linux VM due to some fatal errors reported by PCIe devices.

