[Unscheduled outage, completed] NeCTAR Volume Storage outage in QRIScloud AZ
over 8 years ago
by Stephen Crawley
The Ceph cluster QLD-Ceph, which stores the NeCTAR Volume Storage data, has sustained a hardware failure. Because the cluster does not have enough storage capacity to cope with a node failure, QLD-Ceph is now running in degraded mode and is likely to go into read-only mode very soon. When this happens:
Instances that are booted from a volume in the QRIScloud AZ will lock up.
Instances that have a volume attached and mounted will experience hard file system errors.
If you have critical data on your volume that you have not backed up, you should back it up NOW.
If you no longer need your QRIScloud NeCTAR volume, please contact QRIScloud Support (07 3346 4202) urgently.
For more information and updates, please refer to the QRIScloud announcement: https://qriscloud.zendesk.com/hc/en-us/articles/207711663
UPDATE - The outage is now happening. All affected NeCTAR instances are currently paused and will be un-paused when it is safe to do so. Instance owners have been contacted individually.
FINAL UPDATE - The outage is now over. Please refer to the link above for the full history.