These days, it is difficult to determine whether a cloud has actually gone down. There might be a brief outage, but caching and other systems kick in, and the outage is largely invisible. If your cloud-resident application is available and working for 90% of your audience, but not for the other 10%, is the cloud up or down? Is there an in-between?
Sometimes, it is a question of marketing and the technicalities of service-level agreements. Other times, it is a question for the media. Take Microsoft Azure’s recent outage. There was a problem with a management feature in the compute section of the Microsoft public cloud. For about 24 hours, it was chaos, and for the IT departments affected, the reverberations continued like aftershocks for a while afterward.
As a guy who has spent nearly my entire career in the cloud, the first thing that goes through my mind when I see news like this is, “I’m glad it wasn’t me.” Then I ask myself, “What can customers do to protect themselves from outages like this?”
Architecturally, I know to recommend fully redundant systems, possibly even from different hardware or cloud vendors. As a pragmatist, I find it painfully obvious that few customers have the budget, or the patience, to implement a system like that when the world is pulling them toward a single cloud provider, which itself is likely dependent on specific hardware vendors.
Since the days when IT departments reported to the chief financial officer, it has always been a question of economics versus risk. Today, it is entirely possible to have a backup at a different public cloud provider, even with instances on standby. You can even keep redundant copies of your data or applications on different infrastructure as a service (IaaS) providers, or fail over from one platform as a service (PaaS) to another.
There’s just one little problem with that. It is quite expensive to do it.
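To make the standby arrangement concrete, here is a minimal Python sketch of choosing an active deployment from a prioritized list spread across providers. It is an illustration only: the `Deployment` class, the `pick_active` function, and the provider labels are hypothetical names invented for this example, and the health checks are stand-in functions rather than calls to any real cloud API.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass(frozen=True)
class Deployment:
    """One copy of the application at one provider (hypothetical model)."""
    provider: str                    # illustrative label, not a real account
    is_healthy: Callable[[], bool]   # stand-in for a real health probe


def pick_active(deployments: Sequence[Deployment]) -> Optional[Deployment]:
    """Return the first healthy deployment in priority order, or None."""
    for deployment in deployments:
        try:
            if deployment.is_healthy():
                return deployment
        except Exception:
            # A probe that raises is treated the same as an unhealthy result.
            continue
    return None


# Simulated state: the primary is down, the standby is up.
primary = Deployment("provider-a", lambda: False)
standby = Deployment("provider-b", lambda: True)

active = pick_active([primary, standby])
print(active.provider if active else "no healthy deployment")
```

The selection logic is the cheap part. The expense comes from keeping the standby deployment provisioned, patched, and in sync with the primary.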
Twenty years ago, the discussion was about whether you should quadruple your budget in order to mirror your servers. Today, it is about doing the same thing to mirror your clouds. The situation hasn’t changed much: you only decide to do it when the system is mission-critical and the data is valuable. Often, since the reason for moving to the cloud in the first place is reduced operational costs, it is a sound business decision to rely on a single provider and its investment in its own redundancies.
The problem of multiple clouds is more difficult than simple server mirroring, however. The reason is that different cloud providers have different policies, procedures, and management systems. As a result, maintaining security across redundant cloud providers is likely to be quite painful with many of the traditional security offerings that grew up in the ‘old world’ data center.
The solution to this is to separate the management of your security systems from the security systems themselves, so that you can manage your security across different clouds using a single set of policies. The good news is that there are security companies that have made this a priority, including Trend Micro with its Deep Security platform, and customers can achieve the operational requirements of redundancy without compromising on security.
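The idea of one policy set managed apart from the systems that enforce it can be sketched in a few lines of Python. This is a rough illustration under stated assumptions, not how Deep Security or any particular product works: the `Rule` type, the adapter functions, and both output formats are invented for the example and do not correspond to any real provider's API.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Rule:
    """A provider-neutral firewall rule (hypothetical model)."""
    port: int
    source_cidr: str
    allow: bool


# The single source of truth, defined once for every cloud.
POLICY: List[Rule] = [
    Rule(port=443, source_cidr="0.0.0.0/0", allow=True),
    Rule(port=22, source_cidr="10.0.0.0/8", allow=True),
]


def to_provider_a(rule: Rule) -> Dict[str, object]:
    """Translate a rule into an invented format for 'provider-a'."""
    return {
        "action": "ALLOW" if rule.allow else "DENY",
        "port": rule.port,
        "cidr": rule.source_cidr,
    }


def to_provider_b(rule: Rule) -> Dict[str, object]:
    """Translate a rule into an invented format for 'provider-b'."""
    return {
        "access": "Allow" if rule.allow else "Deny",
        "destinationPort": str(rule.port),
        "sourcePrefix": rule.source_cidr,
    }


# One adapter per provider; the policy itself never changes.
ADAPTERS: Dict[str, Callable[[Rule], Dict[str, object]]] = {
    "provider-a": to_provider_a,
    "provider-b": to_provider_b,
}


def render(policy: List[Rule], provider: str) -> List[Dict[str, object]]:
    """Render the shared policy into one provider's format."""
    adapter = ADAPTERS[provider]
    return [adapter(rule) for rule in policy]


for name in ADAPTERS:
    print(name, render(POLICY, name))
```

Adding a third cloud in this sketch means writing one more adapter, while the rules stay defined in a single place.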