Cloud computing is definitely starting to grow up. There is already a very competitive market, and vendors are continually refining both the technology and business aspects of their offerings. Most vendors are trying to convince their customers that everything must move to the cloud, and that the local datacenter is either dead or relegated to managing local print servers and bandwidth.
I agree that there is a lot of value for companies in leveraging the cloud as part of their strategic planning, but we are a long way from assuming everything will live in the cloud. There is still a lot of value not only in running on-premises data centers, but also in running local software. The cloud merely represents a new option in how the IT organization delivers value, security, and stability to the business. The cloud is a new channel for deploying and managing software assets, but the other classic channels should continue to be leveraged. The IT organization should decide exactly what value the cloud platform brings to them, and look across their assets to see which would benefit the most from those advantages.
There are three key scenarios for when moving to the cloud makes sense, at least in today’s environment. The environment that business and IT operate in today is dominated by both a harsh economy and a very competitive marketplace. “Doing more with less” has never been more important than today. Each of these three scenarios has an element of doing more with less, usually by maintaining a capability while reducing costs.
Capital Crunch
The first clear scenario where leveraging the cloud makes sense is in a startup company. A startup company is usually focused like a laser on launching their new product or service. They are usually run on a shoestring budget, with limited access to capital for building out a sophisticated data center.
With this limited capital, the startup should focus the use of that money on delivering the most value with their software, adding features, and making the experience really shine. This experience is key to their strategy and success in the market. The best way to do this is to hire the best talent they can find.
Building out a complex data center is usually not core to their strategic direction. Their service must be reliable and stable, but they don’t require complete and direct control over the infrastructure. The startup is in the business of their idea, not in managing a data center. Since the cloud is based on the utility model, the startup's costs will scale as their service becomes more popular, and hopefully in step with a growth in revenues. The cloud is also able to provide a more sophisticated data center than the startup would be able to provide. This would include disaster recovery planning, security, clustering, load balancing, and geographic distribution.
By leveraging the cloud, they not only shift the burden of building and managing the hardware of the datacenter, they are also able to leverage the expertise of the cloud operator. This leaves the startup to focus on their core skills, including development, marketing, and sales. The cloud operator provides the skilled workforce needed to build and maintain the datacenter.
A startup hopes to become an overnight success, and this is possible with the Internet. But if the startup has built out a modest datacenter with their limited capital, and then becomes an overnight success, they risk alienating all of those new users, because their datacenter will not be able to support the sudden rush. Most cloud platforms make it easy to scale up with demand. As the startup sees the rush coming, they can scale up their cloud infrastructure, ensuring that the new flood of users has a great experience. This isn’t just about webservers and bandwidth. The startup also needs to scale up the data storage and disk storage capabilities. A sudden flood of users could quickly consume their current SAN infrastructure. Not only does the startup need elastic computing, but they usually also need elastic storage.
Scaling Up to Control A Floodgate
The next common scenario can easily be found in many larger enterprises. In this situation the company has some business process that requires high computational needs, but with little traffic demand. This scenario would usually be solved with scaling your systems up, providing more compute power to each node. This could be in a scientific role, where the compute time is used to simulate the folding of proteins, in order to find the materials that should be tested in a real lab. An insurance company may receive very large files that need to be processed from large clients during open enrollment. Both of these examples are seen as a floodgate event, where there is a sudden spike in the processing needs of the business.
Traditionally the business would either have to scale up their hardware over time so they can respond to these floodgate events, or reduce their ability to respond in a timely manner. Overbuilding infrastructure for the spike is a waste of capital. The wasted capital can be freed up to be allocated to something that will have a strategic impact to the business.
By moving just that business process to the cloud, the company can respond with agility when a flood appears. In the first example, the science team can upload their test data and schedule an increase in the computing power allocated to them, based on the needs of the test being run. The insurance company can compose their system from services in the cloud and applications running on-premises. The cloud services can be developed to dynamically adjust the allocated compute power by instrumenting their code and tying into the management APIs the cloud provides. With this approach, the change in how the inbound files are processed is transparent to the rest of the system and to the customer. The use of an Internet Service Bus would make it easy to tie these elements together, allowing IT to compose a sophisticated system with some elements on-premises and some in the cloud. In this example, the elements that require elastic compute would be abstracted away into the cloud, with the rest being hosted in a more traditional manner.
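To make the idea of instrumented code adjusting its own compute allocation more concrete, here is a minimal sketch in Python. It only covers the sizing decision: given a backlog of inbound files and a deadline, how many nodes should be requested? The function name, the throughput figure, and the node limits are all hypothetical. The call that would actually hand this number to a cloud provider's management API is left out, because that part is specific to each vendor.

```python
import math

def nodes_needed(pending_gb, gb_per_node_hour, deadline_hours,
                 min_nodes=1, max_nodes=50):
    """Return how many compute nodes to request so the backlog clears in time.

    All parameters are illustrative assumptions:
      pending_gb        size of the files waiting to be processed
      gb_per_node_hour  measured throughput of a single node
      deadline_hours    how long the business is willing to wait
      min_nodes/max_nodes  floor and budget ceiling on the allocation
    """
    if gb_per_node_hour <= 0 or deadline_hours <= 0:
        raise ValueError("throughput and deadline must be positive")
    # Work each node can finish before the deadline.
    capacity_per_node = gb_per_node_hour * deadline_hours
    required = math.ceil(pending_gb / capacity_per_node)
    # Never drop below the floor or exceed the ceiling.
    return max(min_nodes, min(max_nodes, required))

# A quiet day needs only the floor; an open enrollment flood asks for more.
quiet = nodes_needed(pending_gb=0, gb_per_node_hour=5, deadline_hours=4)
flood = nodes_needed(pending_gb=400, gb_per_node_hour=5, deadline_hours=4)
print(quiet, flood)
```

The instrumented service would run a calculation like this whenever the backlog changes, and pass the result to whatever scaling call its platform offers. The ceiling matters: without it, a single oversized file could run up an unexpected bill.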
This architecture would improve the responsiveness of the system, and dramatically reduce the capital expense of building out an extensive data center for dramatic spikes in demand. Because the cloud is paid for like a utility, on a consumption model, it becomes easier for IT to calculate the direct cost of processing each job or customer’s file, leading to a better financial management model for the business.
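The direct cost calculation is simple enough to sketch. The Python below assumes a provider that bills per node-hour of compute and per gigabyte stored. Both rates are made-up numbers for illustration, and a real price list will have more line items (bandwidth, transactions, and so on).

```python
def job_cost(node_count, hours, rate_per_node_hour,
             gb_stored=0.0, rate_per_gb=0.0):
    """Direct cost of one job under a pay-for-what-you-use model.

    The rates are hypothetical; substitute the figures from the
    provider's actual price list.
    """
    values = (node_count, hours, rate_per_node_hour, gb_stored, rate_per_gb)
    if min(values) < 0:
        raise ValueError("inputs must be non-negative")
    compute = node_count * hours * rate_per_node_hour
    storage = gb_stored * rate_per_gb
    return compute + storage

# Example: 20 nodes for 4 hours at an assumed 0.50 per node-hour,
# plus 400 GB held at an assumed 0.25 per GB.
print(job_cost(20, 4, 0.50, gb_stored=400, rate_per_gb=0.25))
```

Because every input is metered per job, the result can be charged back to the customer or business unit that sent the file, which is much harder to do with a shared, pre-purchased data center.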
Scaling Out to Manage Spiky Traffic
The third scenario is more common than you might think. Many businesses have a dramatic cycle to their business, with a wide difference in magnitude between the spikes and valleys in the load on their systems. Sometimes this cycle is seasonal in nature, or it reflects an aspect of the industry that the business is in. This situation is usually referred to as low compute, high traffic, and is usually solved by scaling the system out horizontally, providing additional low-powered servers to satisfy the load.
Consider an ecommerce site that specializes in consumer electronics. Through the majority of the year their site traffic might demand that they have a two by two physical architecture. A simple ‘two web servers with two database servers’ platform will meet most of their needs. During the holidays their traffic might triple, requiring more hardware. The business cannot afford to not scale up their hardware because the online retail industry is hyper-competitive. When you go to a store the day after Thanksgiving to buy a DVD player for $4, you are willing to stand in line knowing that you will spend just as much time traveling to the next store, just to find yet another line. But when you visit an ecommerce website and it is a bit slow loading the pages, or sluggish in the checkout process, another retailer that is offering the same product at a similar price is just a click away.
The online retailer cannot risk a situation like this, as they will lose a lot of their potential business, but they can’t afford to triple their hardware just for a few short months. The site needs to deal with broad changes in the amount of traffic it needs to serve. If the retailer moves their web site to the cloud, they can easily scale the environment up and down, as needed. During spring, when their business is usually slower, they can scale their system down, and when the holidays come around, they can dramatically scale up. Leveraging the cloud in this manner also removes the risk of not receiving new hardware in time, along with the work of preparing it and integrating it into the sophisticated load balancing and clustered environment.
Year-end spikes are not the only traffic shape you should look for. Perhaps an auction site has a sudden surge in the late evening hours every weekend, or a book retailer is expecting a surge because of the release of the next big book. These spikes are usually predictable, based on the nature of the business, but not always regular. These spikes are usually based on traffic, not on the amount of compute power needed.
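A rough way to see the difference is to count server-months. The Python sketch below compares a fleet sized once for the holiday peak against a fleet that is resized every month. The traffic figures and the per-server capacity are hypothetical, and only the web tier is counted to keep the example small.

```python
import math

def servers_needed(peak_requests_per_sec, requests_per_server, minimum=2):
    """Servers required for a given load, never below a redundancy floor."""
    if requests_per_server <= 0:
        raise ValueError("requests_per_server must be positive")
    needed = math.ceil(peak_requests_per_sec / requests_per_server)
    return max(minimum, needed)

def server_months(monthly_peaks, requests_per_server, elastic):
    """Total server-months paid for across the given months.

    elastic=False: the fleet is sized for the busiest month and kept all year.
    elastic=True:  the fleet is resized to fit each month's own peak.
    """
    per_month = [servers_needed(p, requests_per_server) for p in monthly_peaks]
    if not per_month:
        return 0
    if elastic:
        return sum(per_month)
    return max(per_month) * len(per_month)

# Assumed year: ten ordinary months, then traffic triples for two months.
monthly_peaks = [1000] * 10 + [3000] * 2
fixed = server_months(monthly_peaks, 500, elastic=False)
scaled = server_months(monthly_peaks, 500, elastic=True)
print(fixed, scaled)
```

With these assumed numbers the fixed fleet pays for 72 server-months and the elastic one for 32. The exact ratio depends entirely on how tall and how long the spike is, so it is worth running the calculation against real traffic logs before drawing conclusions.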
A transitional approach might start with a system running on-premises with enough capacity for a normal day, with the capability to shunt excess load to the instance in the cloud. As the on-premises hardware ages, and is due for a refresh, IT should have a good enough handle on the management aspects of the cloud, and the costs, that they can at that time shift the entire load into the cloud.
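The shunting rule itself can be very simple. The Python sketch below splits incoming load between the two environments, assuming the on-premises capacity is a known, fixed number of requests per second. The function name and the figures are hypothetical, and a real deployment would put this decision in a load balancer or traffic manager rather than in application code.

```python
def split_load(requests_per_sec, on_prem_capacity):
    """Return (on_prem, cloud) shares of the current load.

    On-premises hardware is filled first; only the excess goes to the cloud.
    """
    if requests_per_sec < 0 or on_prem_capacity < 0:
        raise ValueError("load and capacity must be non-negative")
    on_prem = min(requests_per_sec, on_prem_capacity)
    cloud = requests_per_sec - on_prem
    return on_prem, cloud

# A normal day stays entirely on-premises; a spike overflows to the cloud.
print(split_load(800, 1000))
print(split_load(2500, 1000))
```

Tracking how often and how far the cloud share rises above zero also gives IT the usage and cost history they need when the hardware refresh decision arrives.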
These are only a few scenarios that could make sense to move into the cloud. As time moves on, and as the cloud players become mature, I expect more and more scenarios to make sense. A forward-looking architect should start exploring these cases now, so that their business is in a position to take advantage of them when the time is right.
Each business must make decisions around when and what to move to the cloud. The company should focus on reducing infrastructure costs, minimizing the effects of demand variability on their business, and shifting core, yet non-strategic, capabilities off their shoulders. The easiest place to start is where there isn’t a lot of strategic value in directly managing the system, such as email or CRM. Make no mistake, email and CRM are core to almost any business, but how they are operated is usually not a strategic differentiator; they are simply a needed capability, like phones and power. The systems that fall into this category usually do not represent a high-risk scenario. Because the cloud is pay as you go, it is very cheap to try a pilot, sometimes with such a low budget that it falls within the discretionary budget approval of a manager or director.
Identifying which of these scenarios exist in an organization is an important role of any architect. The architect must always be on the lookout for new technologies or capabilities that will lead to either a strategic advantage to the business, or a reduction in costs.
written by Brian Prince
Brian H. Prince is an Architect Evangelist with Microsoft focused on building and educating the architect community in his district. Prior to joining Microsoft in March 2008, he was a Senior Director, Technology Strategy for a major Midwest partner.
Brian holds a Bachelor of Arts degree in Computer Science and Physics from Capital University, Columbus, Ohio. He is also an avid gamer.




