Anyone receiving the latest industry news will have noticed a familiar theme emerging. Data centre power failures have taken down networks and businesses all over the country. In one case the data centre infrastructure was to blame and all power was lost. In others, however, the redundant power feeds stayed up, but either they could not cope with the load transfer or the data centre client had not used the diverse feeds correctly. In a couple of cases, these data centre clients were some of the largest network providers in the country.
To take the positive from this, it does provide a focus and an opportunity to make sure it doesn’t happen to you or your organisation.
How do you ensure that you are not among these unfortunate organisations?
A series of basic questions can give you a ‘heads up’ on your chances of joining them in the near future.
o Are you using your dual power correctly?
o If one feed fails, is the other feed large enough (so you won’t trip your supply)?
o Has your facilities provider left enough ‘spare’ capacity on your second feed for a failover? Are both feeds coming from the same UPS? It is important to know the answer to both questions.
o Is your emergency out-of-band access diversely powered as well?
o Are your comms providers at the same location also using dual feeds properly, so that in the event of a failure they automatically switch to a second, separate power feed that won’t overload? Make sure you know, as there is no point in your equipment being powered on if you have no connectivity.
These questions might seem very simple, and quite possibly obvious, but it is surprising how many companies have never checked these basic setup conditions before committing their business to particular suppliers or facilities.
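The capacity question comes down to simple arithmetic: if one feed fails, the surviving feed has to carry the load of both. The sketch below is illustrative only. The `Feed` class, the ampere figures and the 80% headroom factor are all assumptions made for the example, so substitute the ratings and measured loads that your facility actually provides.

```python
from dataclasses import dataclass


@dataclass
class Feed:
    name: str
    rated_amps: float  # circuit rating (hypothetical figure)
    load_amps: float   # draw measured in normal operation (hypothetical figure)


def survives_failover(a: Feed, b: Feed, headroom: float = 0.8) -> dict[str, bool]:
    """For each feed, report whether it could carry the combined load alone.

    headroom is an assumed safety margin: the fraction of the rating
    you are willing to load continuously.
    """
    total = a.load_amps + b.load_amps
    return {f.name: total <= f.rated_amps * headroom for f in (a, b)}


feed_a = Feed("A", rated_amps=32, load_amps=14)
feed_b = Feed("B", rated_amps=32, load_amps=13)
print(survives_failover(feed_a, feed_b))  # {'A': False, 'B': False}
```

In this made-up case each feed looks comfortable on its own, running at well under half its rating, yet neither could take the combined 27 A within the assumed margin. That is the situation the questions above are meant to uncover.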
o When did you last test it?
o Schedule an annual single feed power failure test during a maintenance window.
o Retest after significant infrastructure changes or upgrades.
o Check regularly with your facility that they are able to provide all your racks with full power on either supply.
Again, this might seem to be a minimal regime, but every extra step you take to make sure you are fully redundant helps. It is also important to check that these tests are carried out regularly and that the results are recorded.
o Dual site resilience – Do you have a ‘B’ site?
o Total failures of a single site can still happen. Is your ‘B’ site ready to take over?
o Schedule and test your failover. How long does it take, and can you do it without anything being available at the primary site?
One of the prerequisites of disaster recovery planning is to try to prevent the disaster in the first place. Having a second, separate site offers that assurance if the primary site goes down.
o Do you have provider upstream diversity?
o Go deep with your providers – query where they get their upstream services from.
o Check to ensure that your ‘diverse’ providers are indeed diverse and are not using a common upstream service/facility/interconnect etc.
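Once your providers have told you where their upstream services come from, checking for overlap is a simple comparison. The following is a minimal sketch: the provider names and upstream labels are hypothetical placeholders, and `shared_dependencies` is an illustrative helper rather than part of any real tool.

```python
from itertools import combinations

# Hypothetical answers gathered from each provider.
upstreams = {
    "provider_1": {"carrier_x", "exchange_north", "facility_a"},
    "provider_2": {"carrier_y", "exchange_north", "facility_b"},
    "provider_3": {"carrier_z", "exchange_south", "facility_c"},
}


def shared_dependencies(deps):
    """Return the common upstreams for every pair of providers that overlaps."""
    return {
        (p, q): deps[p] & deps[q]
        for p, q in combinations(sorted(deps), 2)
        if deps[p] & deps[q]
    }


for pair, common in shared_dependencies(upstreams).items():
    print(pair, "share", sorted(common))
# ('provider_1', 'provider_2') share ['exchange_north']
```

Any pair that appears in the output is not truly diverse: a failure at the shared upstream would take out both providers at once.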
If you receive satisfactory answers to all of these questions, you will have gone a long way towards ensuring that you and your business do not join the list of disappointed and frustrated clients who thought they had covered all their risks by engaging big-brand facility and service providers.