The cloud is all the hype now, but most people don’t keep in mind that it is not a silver bullet or a panacea. The cloud (or, more technically speaking, the use of large-scale virtualization) is recommended for specific use cases. In this article, we’re going to go through the use cases for which the cloud is recommended. In the final chapter, I am going to emphasize some of the caveats, traps and risks associated with transitioning to the cloud.
Whether we realize it or not, we are already users of cloud services every time we sign into Gmail, Picasa, Facebook, LinkedIn, Office 365 or Dropbox. This is the most frequent and widespread use case for the cloud. For the average consumer, freelancer, or small or medium-sized company, it does not make sense to invest in and directly operate services such as:
- Email (Gmail for Business, Office 365)
- File Storage (Google Drive, Dropbox)
- Document Management (Gmail for Business, Confluence, Office 365)
The providers of such services can afford to invest much more knowledge and effort in operating a reliable infrastructure because of their scale. In other words, if developing service X (let’s say email) would cost $10 million, it would not make sense for a company with 20 employees, but it would make sense for a service provider which has 100,000 such companies as its customers. The advantages of using such cloud services revolve around:
- No up-front cost (like the cost of owning hardware)
- Predictable operating cost (for instance, Google Drive sells 1 TB for $120/year)
- Less downtime and better quality of the service
- Better data reliability (as big service providers can afford to store the same data in 3 or more places at any one time)
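The economy-of-scale argument made earlier can be checked with a line of arithmetic. The sketch below uses only the illustrative figures from the text ($10 million, 100,000 customer companies); the function name is made up for this example, and the even split of cost is an assumption.

```python
# Illustrative figures from the text above; not real pricing.
DEVELOPMENT_COST = 10_000_000   # cost of building "service X", in dollars
PROVIDER_CUSTOMERS = 100_000    # companies sharing the provider's service


def cost_per_company(total_cost: int, companies: int) -> float:
    """Spread a fixed cost evenly across the companies that share it."""
    return total_cost / companies


in_house = cost_per_company(DEVELOPMENT_COST, 1)
shared = cost_per_company(DEVELOPMENT_COST, PROVIDER_CUSTOMERS)
print(f"In-house: ${in_house:,.0f} per company")
print(f"Shared:   ${shared:,.0f} per company")
```

Under these assumptions the same service costs each customer company around $100 of development effort instead of $10 million, which is why the provider can afford a level of reliability that a 20-person company cannot.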
However, one also has to be aware of the risks and down-sides of such services:
- No Internet connection can mean no access to data or to the service. Although this risk can be partially mitigated by having local snapshots of the data, it is worth considering. Nevertheless, in today’s connected business, no Internet often means no business.
- No physical ownership of the data. From an operating perspective this is rarely a problem, but this is worth considering from a legal and business continuity perspective.
- Potentially slower access to the data, especially for large files (studio size images, videos), as the Internet is still considerably slower than a local network.
To facilitate the transition from an on-site service to a cloud service, it is always a good idea to do a pilot program with a small team or for a smaller project, so as to have the chance to smooth out any bumps in the road with minimal business impact.
You might get an average traffic of 1 million page views/hour most of the year, but you might spike to somewhere between 10 and 15 million page views/hour during 5 or 10 days of the year. This is especially common for e-commerce sites around the holidays (winter, spring). The non-cloud solutions would entail either over-scaling your infrastructure just to cope with those 5-10 days of traffic spikes or settling for unsatisfactory performance during the most profitable time of the year. A cloud solution would allow you to grow or shrink your infrastructure depending on these needs in one-hour steps.
Let’s say you can handle the “normal” traffic with 4 web servers and 4 application servers. Simple math would dictate you would need somewhere between 40 and 60 web servers and 40 to 60 application servers to handle the peak load (the exact number would depend on application type, your business process, and average machine load during “normal” traffic). If you were to host this infrastructure on-site, roughly 90% of your cost would be wasted during the 355 days of the year when you don’t need the extra juice. What a cloud provider does is allow you to activate and pay for this extra infrastructure only when you need it.
In the case where spikes are expected to have a yearly seasonality, it is reasonable to rely on cloud services only during that time of the year. However, many services, such as content streaming, may have a daily seasonality (high traffic during the evenings). The cases where the traffic seasonality is finer-grained (daily) may require moving the entire web serving solution into the cloud.
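The "simple math" behind the spike scenario can be written down as a rough sketch. It assumes that capacity scales linearly with traffic and it compares server-hours, not money, since on-site and cloud prices per hour differ. The function names are made up for this example; the numbers come from the text.

```python
import math

BASELINE_SERVERS = 4          # servers per tier for "normal" traffic (from the text)
PEAK_MULTIPLIERS = (10, 15)   # peak traffic relative to normal (from the text)
PEAK_DAYS = 10                # upper end of the "5 or 10 days" (from the text)


def servers_needed(baseline: int, multiplier: float) -> int:
    """Assume capacity scales linearly with traffic (a simplification)."""
    return math.ceil(baseline * multiplier)


def idle_fraction(baseline: int, peak: int) -> float:
    """Share of a peak-sized fleet that sits unused under normal traffic."""
    return 1 - baseline / peak


def yearly_server_hours(baseline: int, peak: int, peak_days: int) -> tuple:
    """Return (fleet always sized for peak, fleet resized only on peak days)."""
    fixed = peak * 24 * 365
    elastic = baseline * 24 * (365 - peak_days) + peak * 24 * peak_days
    return fixed, elastic


for multiplier in PEAK_MULTIPLIERS:
    peak = servers_needed(BASELINE_SERVERS, multiplier)
    fixed, elastic = yearly_server_hours(BASELINE_SERVERS, peak, PEAK_DAYS)
    print(f"{multiplier}x peak: {peak} servers per tier, "
          f"{idle_fraction(BASELINE_SERVERS, peak):.0%} idle off-peak, "
          f"{fixed:,} vs {elastic:,} server-hours per year")
```

For the 10x case this works out to 350,400 server-hours per tier for a fleet permanently sized for the peak, versus 43,680 when the extra machines run only on the ten peak days.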
Once-in-a-while High Volume Data Crunching
For irregular, high-volume workloads, the cloud is also your friend. If every three months you need to crunch 100 TB of data in one go (in 12-48 hours), there’s no reason for you to keep and pay for the required infrastructure for the entire three months. This use case is common in research, where workloads are not periodic and tend to be intensive. Another use case would be reporting. However, you need to keep in mind that large-volume reporting workloads (quarterly, yearly reports) can be split into smaller workloads (hourly/daily/weekly roll-ups), which can then be joined/summed up together fairly quickly even on a smaller infrastructure. This way, the effort is spread out over time and the quarterly/yearly spikes at the reporting date can be a lot smaller.
Top three vendors:
- Amazon Web Services
- Microsoft Azure
- Google Cloud Platform
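The roll-up idea described above for reporting workloads can be sketched in a few lines. This is a minimal illustration, not a reporting pipeline: the event format, the function names and the sample data are all made up, and the approach only works for metrics that can be added up (sums, counts), not for things like medians or distinct counts.

```python
from collections import defaultdict
from datetime import date


def daily_rollup(events):
    """Collapse raw (day, amount) events into one total per day."""
    totals = defaultdict(float)
    for day, amount in events:
        totals[day] += amount
    return dict(totals)


def quarterly_total(daily_totals, year, quarter):
    """Sum the small daily roll-ups instead of rescanning the raw events."""
    first_month = 3 * (quarter - 1) + 1
    months = range(first_month, first_month + 3)
    return sum(total for day, total in daily_totals.items()
               if day.year == year and day.month in months)


# Hypothetical raw events; in practice the roll-up would run every day.
events = [
    (date(2024, 1, 5), 100.0),
    (date(2024, 1, 5), 50.0),
    (date(2024, 2, 10), 25.0),
    (date(2024, 4, 1), 999.0),   # falls in Q2, so it is excluded below
]
rollups = daily_rollup(events)
print(quarterly_total(rollups, 2024, 1))  # 175.0
```

The quarterly job then touches at most about 90 small rows per metric instead of the full raw data set, which is what keeps the end-of-quarter spike small.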
Static Assets, Geographic Spread and Content Delivery Networks
It just might be the case that a US-based company has a lot of users from Europe or from Asia. In this case, it is highly advisable to use a Content Delivery Network (CDN). A CDN customarily delivers to end users the static assets from your site (stuff that rarely changes and that is the same regardless of the user): images, CSS files, JS files, videos. The CDN is basically a network of servers (called nodes or edges) around the world which copy the static content from your website (even if your website is NOT cloud-hosted) and then serve this static content many times to different users in their geographic vicinity. This brings several advantages for your website:
- Offloading your main servers: your main server(s) only have to serve the content a few times (once for each CDN node/edge), while that content would then be served thousands of times to the end users.
- Closer means faster: on average, the edge will be geographically closer to the end-user, which means that the round-trip of the data packets will be faster, which means your site will load faster.
- Many is better: most CDNs will allow different requests (files) belonging to the same page to be served from different domains, which means that the end user’s browser will be able to request more stuff at the same time, which is yet another source of speed-up.
It is important to understand that your web servers do not need to be virtualized or cloud-hosted in order for you to benefit from a CDN. You can very well set up a CDN in front of your on-site hosted website. If you have a small website/blog without too many requirements around it, you can very well try a CDN such as CloudFlare for free. They have a tutorial around it (a bit technical, but you’ll live).
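To make the "offloading your main servers" point concrete, here is a toy simulation. It is only a sketch: `Origin` and `EdgeNode` are hypothetical classes invented for this example, and real CDNs also handle expiry, eviction and cache headers, which are ignored here.

```python
class Origin:
    """Stands in for your own web server; counts how often it is hit."""

    def __init__(self, files: dict):
        self.files = files
        self.requests_served = 0

    def fetch(self, path: str) -> bytes:
        self.requests_served += 1
        return self.files[path]


class EdgeNode:
    """Hypothetical CDN node that keeps a copy of each file it has fetched."""

    def __init__(self, origin: Origin):
        self.origin = origin
        self.cache = {}

    def get(self, path: str) -> bytes:
        if path not in self.cache:          # first request: go to the origin
            self.cache[path] = self.origin.fetch(path)
        return self.cache[path]             # later requests: served locally


origin = Origin({"/logo.png": b"fake image bytes"})
edges = [EdgeNode(origin) for _ in range(3)]   # e.g. US, Europe, Asia

user_requests = 0
for edge in edges:
    for _ in range(1000):
        edge.get("/logo.png")
        user_requests += 1

print(f"{user_requests} user requests, {origin.requests_served} origin fetches")
```

In this simplified model, 3,000 user requests result in only 3 fetches from the origin, one per edge, regardless of where the origin itself is hosted.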
Backup and Disaster Recovery
This subject is a touchy one. Cloud storage services usually offer a good-to-great level of durability for stored data (think many 9s after the decimal separator). Amazon S3, for example, is designed for 99.999999999% durability (that’s nine nines after the decimal separator). This is far better than what you could realistically achieve on-site, so it makes cloud services an ideal candidate for backing up your data. However, entrusting your data (and customer data, including personally identifying information, credit cards and so on) to an off-premises provider may be perceived as too high a risk or may prove non-compliant with the security standards for which your company is certified (such as PCI DSS or ISO/IEC 27001). One mitigation for such risk is to encrypt the data with a key that only you control before transmitting it to the cloud service. However, this raises the issue of securely and reliably storing the keys (in at least two geographic locations, in order to achieve disaster recovery capability).
Thus, in the light version of this use case, one can use cloud storage services for periodically backing up data in a durable way, preferably with a layer of added encryption. However, in case of disaster, retrieving the data from cloud storage and resuming operations can take days or even weeks, which may prove unacceptable for business continuity.
A use case more suitable for mature companies is to have an up-to-date replica of their entire infrastructure ready for deployment (but not deployed) in the cloud. This cloud infrastructure could be activated in case of disaster to handle the load while the on-site infrastructure is being reinstated, bringing customer-facing downtime down from days to hours. However, this scenario requires guaranteeing data freshness through more frequent snapshots, as well as several complex disaster recovery tests to ensure the cloud infrastructure would be deployed and would function as expected.
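Two of the numbers in this section are easier to judge with a quick calculation: what a durability figure with that many nines means in practice, and why a full restore can take days. The sketch below is a back-of-the-envelope estimate only. It treats durability as an independent per-object, per-year survival probability, and it assumes the network link is fully used for the whole restore. The object count, backup size and link speed are made-up inputs.

```python
from fractions import Fraction


def expected_objects_lost_per_year(object_count: int, durability_percent: str) -> Fraction:
    """Expected yearly losses if each object independently survives a year
    with the given probability (an idealized reading of the durability figure)."""
    loss_probability = 1 - Fraction(durability_percent) / 100
    return object_count * loss_probability


def restore_days(data_terabytes: float, link_megabits_per_second: float) -> float:
    """Days needed to download a backup at a fully sustained link speed."""
    bits = data_terabytes * 8 * 10**12
    seconds = bits / (link_megabits_per_second * 10**6)
    return seconds / 86_400


# Hypothetical inputs: 10 million stored objects, a 100 TB backup, a 1 Gbit/s link.
lost = expected_objects_lost_per_year(10_000_000, "99.999999999")
print(f"Expected loss: one object every {1 / lost} years")
print(f"Restoring 100 TB over 1 Gbit/s: {restore_days(100, 1000):.1f} days")
```

With these inputs the expected loss is one object every 10,000 years, while the restore alone takes a bit over nine days before any redeployment work starts. That gap between durability and recovery time is the reason for the standby-infrastructure scenario described above.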
As no analysis is complete without also emphasizing the reasons against a certain solution, one should carefully consider the following points when planning to use or move to the cloud:
- Cost. At a large scale (hundreds of machines), having an on-site infrastructure and/or a private cloud (on-site virtualization solution) can become more efficient than renting cloud services. However, most companies don’t reach such a scale. Nevertheless, one should always keep a close eye on cost, as the freedom of expanding and shrinking afforded by cloud providers comes at a price. Also, keep in mind that considerable reductions in cost can be achieved by using reservation plans with cloud providers: making a commitment to a certain usage over 1-3 years in exchange for a reduction in cost.
- Data Ephemerality. Unlike physical machines, virtual machines get terminated (read “disappear”) all the time. By default, if they are not configured to use persistent storage (such as Amazon’s Elastic Block Store or Simple Storage Service), the data on them disappears with them. Make sure your persistent data is actually always stored on persistent media!
- Operational practices. Given that the cloud encourages the use of many (often smaller) virtual machines as opposed to a few large ones, it quickly becomes impractical for the operations team to manually configure each machine. That is, operations needs to focus on automation (automatically deploying and configuring machines) and shift away from the “log on to the machine and configure it” view.
- Security. Although cloud providers offer a great array of tools for managing security (firewalls, dynamic keys), most system administrators may not be familiar with these tools and best practices. Make sure your team(s) have a good understanding before moving business-critical data and apps to the cloud.
- Performance/Cost. The virtualization associated with cloud computing comes with a performance penalty, the degree of which may vary greatly depending on the type of application you’re using. That isn’t to say you cannot get the same performance from a cloud machine that you can get from a physical machine (dedicated, bare-metal hosting). It just means that you may end up paying more for it. In other words, you might end up getting less bang for your buck. Be sure to benchmark performance on several instance types and make at least a high-level cost projection. Otherwise, you might end up unpleasantly surprising your CFO.
- Legal/regulatory requirements. Some companies (usually enterprise/corporate size) are subject to legal and regulatory requirements not to share customer data with third parties or not to store/transmit customer data outside of the country. Be sure to triple-check said legal and regulatory requirements or to find a technical solution which does not store/transmit customer data to the cloud provider (for instance, using a CDN for serving static assets from sessionless domains would be a good solution for separating cloud-delivered content from in-house stored customer data).
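For the cost-related points in the list above (reservation plans and performance per dollar), a high-level projection can start from something as small as the sketch below. Every price and throughput figure in it is made up for illustration. Real reservation plans come with their own terms, and the throughput numbers should come from your own benchmarks.

```python
HOURS_PER_YEAR = 24 * 365


def break_even_utilization(on_demand_hourly: float, reserved_yearly: float) -> float:
    """Fraction of the year a machine must run before reserving is cheaper."""
    return reserved_yearly / (on_demand_hourly * HOURS_PER_YEAR)


def cost_per_million_requests(hourly_rate: float, requests_per_second: float) -> float:
    """Price-performance of one instance type, from your own benchmark."""
    return hourly_rate / (requests_per_second * 3600) * 1_000_000


# All prices and throughput numbers below are made up for illustration.
print(f"Reserve if busy more than {break_even_utilization(0.50, 2628):.0%} of the year")

benchmarks = {"small": (0.50, 200), "large": (1.00, 500)}  # (USD/hour, requests/s)
for name, (rate, rps) in benchmarks.items():
    print(f"{name}: ${cost_per_million_requests(rate, rps):.2f} per million requests")
```

In this made-up example, a reservation pays off for machines that run more than about 60% of the year, and the larger instance turns out cheaper per request despite its higher hourly price. Conclusions like these are what the benchmark is meant to surface before the bill does.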
The key take-away from this post is that cloud services provide a vast array of tools to help in today’s business and technology environment, without, however, being a one-size-fits-all solution. Either starting up on or transitioning to the cloud requires careful planning and an in-depth understanding of the processes to be implemented as well as of the technology landscape.
As always, any questions, comments and feedback are more than welcome. I’m also open to discussing specific use cases and integration scenarios.