Still no decent explanation of the BA outage.
Good to hear the board have requested an external inquiry.
http://www.bbc.co.uk/news/technology-40118386
The explanation given leaves more questions unanswered, imo. Critical servers, new and old, usually have dual power supplies. The idea is that the two power supplies are fed from two different electrical circuits, so if one circuit fails the second carries on.
Then there are clustered nodes and high availability for failing servers over from one to another. The level of redundancy is usually layered depending on how critical the systems are. Servers have UPS, which can support them for anything from 10 minutes to a couple of hours or more. Most power failures are spikes or brief drops. In the event of a major power outage, the UPS will keep systems running and raise alerts, allowing sufficient time for an orderly shutdown or a controlled failover to a DR system / site.
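To make the UPS point concrete, here's a rough sketch of the kind of decision logic I mean. It's purely illustrative: the names (`UpsStatus`, `next_action`) and the 10 and 5 minute thresholds are my own assumptions, not anything from BA or from a particular UPS vendor.

```python
from dataclasses import dataclass


@dataclass
class UpsStatus:
    on_battery: bool          # True when mains has dropped and the UPS is carrying the load
    runtime_left_min: float   # estimated minutes of battery remaining


def next_action(status: UpsStatus,
                shutdown_needs_min: float = 10.0,   # assumed time an orderly shutdown takes
                safety_margin_min: float = 5.0) -> str:  # assumed extra headroom
    """Decide what to do based on how much battery runtime is left."""
    if not status.on_battery:
        return "normal"                 # mains is fine, nothing to do
    if status.runtime_left_min > shutdown_needs_min + safety_margin_min:
        return "ride_through"           # brief drop or spike: wait for mains to return
    return "failover_or_shutdown"       # running low: fail over to DR or shut down cleanly
```

So a short blip gets ridden out on battery, and only a long outage triggers the controlled failover or shutdown, well before the battery runs flat.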
Moreover, UPSs and the latest servers are more resilient to power surges and drops, as they have voltage regulation and tolerances built in. In my experience I've never heard of a power surge taking out a whole data centre.
They refer to an uncontrolled return of power. This doesn't add up either. Most wiring now has circuit trips at various points. Even offices have these, and you might find one bank of desks works and others don't, because the return of power throws switches that trip the circuit and prevent damage. Then there are fuses and circuit breakers.
When powering up a whole building, power often goes up in stages, a floor at a time, and even then some circuits can still get tripped. There are so many protection layers that, once again, I've never encountered a whole data centre being taken out by the return of power.
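As a toy illustration of the staged power-up idea (a simplified model with made-up numbers, not how any real building management system works): each floor is energised in turn, and a circuit whose inrush current exceeds its breaker rating trips on its own without affecting the others. The function name and the single shared breaker limit are assumptions for the sake of the example.

```python
def staged_power_up(floors, breaker_limit_amps):
    """Energise floors one at a time.

    floors: dict mapping floor name -> dict of circuit name -> inrush current (amps).
    Returns (energised, tripped) as lists of (floor, circuit) pairs.
    """
    energised, tripped = [], []
    for floor, circuits in floors.items():       # one floor at a time
        for circuit, inrush_amps in circuits.items():
            if inrush_amps > breaker_limit_amps:
                tripped.append((floor, circuit))     # breaker isolates just this circuit
            else:
                energised.append((floor, circuit))
    return energised, tripped


# Hypothetical example: one overloaded circuit on floor1 trips, everything else comes up.
floors = {"floor1": {"a": 10, "b": 40}, "floor2": {"a": 12}}
energised, tripped = staged_power_up(floors, breaker_limit_amps=32)
```

The point is that a fault stays local: one tripped breaker costs you one circuit, not the whole site.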
So I'm at a loss as to how a power cut and restoration at an enterprise-class data centre can take out the whole data centre's IT kit? Even in third world countries, when watching a World Cup match on TV, you'll find they have power regulators and small UPSs for their TVs.
BA's cost cutting sounds very severe if they removed all these layers of redundancy and safety mechanisms, leaving data centre power as a single point of failure.
I'm not happy. As a customer I'd like to see that detailed report and what they are going to do about it, as I certainly will not be flying with them again unless a full explanation is given and resolute action is taken to address the issues.