Rackspace outage was third in two days
This update was sent to us by a customer of Rackspace, the Texas Web host that temporarily lost a datacenter when a truck collided with a nearby power transformer. Posted on my.rackspace.com, the company's customer support website, it reveals this was the third power issue in two days: one Sunday morning, one Monday morning and another this evening. Not all customers were affected by all three outages, but three outages at one hosting facility is not good.
Monday Nov. 12th 8:00PM CST — Thank you for your patience today as we work through a root cause analysis of the power issues in the DFW data center. The Data Center Engineering team is continuing to work on the plan to move back to utility power without any further interruption in service. Rackspace will notify you in advance before we move to switch back to utility power.
In the meantime, here is a brief timeline of events:
Sunday, Nov. 11, 2007
4:19 a.m. CST — A problem in the internal utility power distribution grid caused an outage to cabinets in one section of the DFW data center.
6:49 a.m. CST — Power was fully transferred to generator power. Based on building monitoring systems, outage times varied for every customer. DC engineering worked to isolate the internal utility problem and restore the integrity of the internal distribution system.
6:32 p.m. CST — A separate incident occurred when a breaker in the generator power grid tripped, causing one of the Power Distribution Units (PDUs) in the same section of the DFW Datacenter to fail, affecting a much smaller group of the customers in this section. All customer devices with dual power supplies in this section of the datacenter remained online and were not affected. Customer devices with single power supplies in this area were affected. Data Center technicians immediately acted to minimize the impact on these customers by moving these devices manually to alternate power supplies, resulting in just a few minutes of downtime.
7:40 p.m. CST — The breaker problem was diagnosed and resolved, bringing the down PDU back online.
Monday, Nov. 12, 2007
4:00 a.m. CST — The Data Center engineering team had the initial utility distribution grid realigned and resynchronized. All systems reported ready for operation.
4:30 a.m. CST — Transfer of power was initiated and affected devices were slowly moved off of generator power and back to internal utility distribution power.
5:10 a.m. CST — Transfer of power was completed.
5:25 a.m. CST — Unfortunately, the internal distribution grid failed again. Data Center engineering was able to transfer all affected devices back to generator power in under 15 minutes.
5:40 a.m. CST — All affected devices were back on generator power. The Data Center environment is stable and is designed to be able to run indefinitely on generator power. Data Center engineering is continuing to diagnose the problem and engaging all vendors onsite.
We will continue to provide updates via the MyRackspace portal. In addition, you will receive notification of any maintenance windows. We are committed to supporting your business and minimizing any impact on your hosted environment.
Here is another report that was sent to us regarding this evening's problems.
DFW Datacenter Power Update
Nov. 12th 8:30PM CST
In a completely unrelated incident to this weekend's power problems in DFW, a traffic accident caused damage to a power transformer which provides utility power to our DFW data center. Here is the current sequence of events:
* At approximately 6:00 p.m. CST utility power was lost to the DFW data center
* Power automatically switched over to backup generators without disrupting service for any customers
* When generator power was established, two chillers within the data center failed to start back up
* Utility power was re-established through a secondary utility source
* As a result of temporary data center temperature increases, we proactively shut down a number of customer servers to protect them from overheating
At this point, the chillers are back up and running and we are operating on generator power throughout the data center. We have contractors on site to repair the damage and will be in contact with all customers who have been affected by this outage. We apologize for any disruption to your business operations and will work diligently to restore your service.
Previously:
Rackspace was "most reliable" webhost in September 2007
Truck driver in Texas kills all the websites you really use
Rackspace outage affects Texas ISP