
[Unscheduled Outage] Fire alarm triggered, Site evacuated at Noble Park DC, melbourne-np AZ

We just received a notification from Fujitsu, which manages the Noble Park Datacentre hosting the melbourne-np availability zone:


https://emergency.vic.gov.au/respond/#!/incident/1692967


Fire alarm triggered, Site evacuated
Facility: Noble Park
Severity: Sev 2
System Impacted: Fire Systems/VESDA
Client Impact:
Next Update Due: 24-05-2018 19:30


Many OpenStack compute nodes are currently down. If you have VMs running in the melbourne-np AZ, you are likely experiencing an outage.


We are still assessing the extent of the disruption and damage. We will post any further updates from the datacentre as we receive them.


Linh Vu (Melbourne Node) 

Access to site restored. Data Hall 2 investigation underway.

Facility: Noble Park
Severity: Sev 2
System Impacted: Fire Systems/VESDA
Client Impact:
Next Update Due: 24-05-2018 18:50

Fault isolated to an STS within DH2. As a precaution, the STS and corresponding PDU have been powered off awaiting further investigation by Schneider. Power redundancy is reduced for the associated racks.

Facility: Noble Park
Severity: Sev 2
System Impacted: Fire Systems/VESDA
Client Impact:
Next Update Due: 24-05-2018 19:20

Redundant power returned to the affected racks via STS bypass mode. Still awaiting scheduling of the replacement component.

Facility: Noble Park
Severity: Sev 2
System Impacted: Fire Systems/VESDA
Client Impact:
Next Update Due: 24-05-2018 23:30

Replacement STS components scheduled to arrive at Noble Park Data Centre at approximately 10:30.
Facility: Noble Park
Severity: Sev 2
System Impacted: Fire Systems/VESDA
Client Impact: 
Next Update Due: 25-05-2018 11:00

Replacement STS components and technician are on site and non-invasive preparation works have commenced.

Facility: Noble Park
Severity: Sev 2
System Impacted: Fire Systems/VESDA
Client Impact:
Next Update Due: 25-05-2018 12:00

Power services have been restored to the Noble Park Data Centre and most of the research cloud services have returned to operation. However, some equipment failed to recover, and 5 compute hosts remain down. Virtual machines provisioned on those hosts are not available at this time. We will be contacting affected users directly. Several additional nodes may still be experiencing connectivity issues due to an impacted network switch.

 

During the outage, many VMs lost access to volume storage; however, these connections should have been restored when the data services returned. We are investigating whether any data connectivity problems persist. We appreciate your patience and understanding as we return our services to normal operation.

 

Regards

 

Bernard Meade

Research Computing Services
