Stanford engineers create software tool to cut cloud computing costs


Just like Netflix uses an algorithm to recommend which movies we should see, a Stanford software system offers instant advice to thousands of server farm computers on how to effectively share the workload.



Stanford engineers Christina Delimitrou and Christos Kozyrakis have created a software tool that can triple the efficiency of computer server clusters.

We hear a lot about the future of cloud computing, but not a lot about the efficiency of the data centers that make the cloud possible. In these facilities, clusters of server computers work together to host applications ranging from social networking to big data analytics.

Data centers cost millions of dollars to build and operate, and purchasing servers is the single largest expense these centers face. Yet at any given time, most servers in a typical data center are using only 20 percent of their capacity.

Why? Because the workload can vary greatly depending on factors such as the time of day, the number of logged-in users, or sudden and unexpected spikes in demand. Keeping excess capacity on hand is the usual way to handle this peak-demand problem.

But as cloud computing expands, the cost of maintaining such a large capacity cushion will grow with it. That’s why two Stanford engineers created a cluster management tool that can triple server efficiency while providing reliable service at all times, enabling data center operators to serve more customers for every dollar invested.

Christos Kozyrakis, associate professor of electrical engineering and computer science, and Christina Delimitrou, a doctoral student in electrical engineering, will present their cluster management system, called Quasar, when scientists who design and manage data centers meet for a conference in Salt Lake City beginning March 1.

“This is a proof of concept for an approach that could change the way we manage server clusters,” said Jason Mars, professor of computer science at the University of Michigan at Ann Arbor.

Kushagra Vaid, general manager of cloud server engineering at Microsoft Corp., said the largest data center operators have devised their own ways to manage operations efficiently, but many smaller organizations have not.

“If you can double the amount of work you do with the same server footprint, it would give you the agility to grow your business quickly,” said Vaid, who oversees a global operation with more than a million servers supporting more than a billion users.

How Quasar works requires some explanation, but a key ingredient is a sophisticated algorithm that is inspired by the way companies like Netflix and Amazon recommend movies, books, and other products to their customers.

How it works

To understand what’s new in Quasar, it helps to think about how data centers are managed today.

Data centers run applications such as search and social media services for consumers, or large-scale data mining and analytics for businesses. Each of these applications places different demands on the data center and requires a different amount of server capacity.

The cloud ecosystem includes the software developers who create applications and the cluster management tools that decide how to distribute the workload, assigning applications to particular servers. Before making those assignments, cluster managers typically ask developers to estimate the capacity their applications will require. Developers then reserve server capacity much as one might reserve a table at a restaurant.

“Today the data centers are managed by a reservation system,” said Kozyrakis of Stanford. “Application developers estimate what resources they will need and they reserve that server capacity.”

It’s easy to understand how a reservation system lends itself to excess unused capacity. Developers are likely to err on the side of caution. Because a typical data center is running many applications, the sum of all of these overestimates results in significant excess capacity.

Kozyrakis worked with Delimitrou, a graduate student in his Stanford laboratory, to change this dynamic by moving away from the reservation system.

Instead of asking developers to estimate how much capacity they are likely to need, Stanford’s system starts by asking what performance their applications require. For example, if an application serves user requests, how quickly must it respond, and for how many users?

Under this approach, it becomes the cluster manager’s job to ensure that there is sufficient server capacity in the data center to meet all of these requirements.

“We want to move from reservation-based cluster management to performance-based data center resource allocation,” Kozyrakis said.
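The contrast between the two management styles can be sketched in a few lines of code. Everything below is illustrative only: the names `reservation`, `performance_target`, and `size_allocation`, and all the numbers, are invented for this sketch and are not Quasar’s actual interface.

```python
import math

# Reservation-based: the developer guesses raw capacity up front,
# typically padding the estimate to be safe.
reservation = {"servers": 40, "cores_per_server": 16}  # often far more than needed

# Performance-based: the developer states only the goal; the cluster
# manager decides how much capacity that goal actually requires.
performance_target = {"queries_per_second": 50_000, "p99_latency_ms": 200}

def size_allocation(target, per_server_qps=2_500):
    """Toy sizing rule: just enough servers to meet the throughput goal."""
    return math.ceil(target["queries_per_second"] / per_server_qps)

servers_needed = size_allocation(performance_target)
print(servers_needed)  # 20 servers rather than the padded reservation of 40
```

In the reservation model the padding is locked in whether or not it is used; in the performance model, the manager can resize the allocation as load changes while still honoring the stated target.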

Quasar is designed to help cluster managers meet these performance goals while using data center resources more efficiently. To create this tool, the Stanford team borrowed a concept from the Netflix movie recommendation system.

If you liked this app …

Before diving into Quasar’s algorithms, it helps to know that servers, like some people, can multitask. So the easiest way to increase server utilization would be to run multiple applications on the same server.

But multitasking doesn’t always make sense. Take parenting, for example. A mom or dad can wash the dishes, watch TV, and call out spelling words to help a child with homework. But if the question is about algebra, it might be wise to dry their hands, turn off the TV, and focus on the problem.

The same goes for software applications and servers. Sometimes different applications can coexist on the same server while still meeting their performance goals; other times they can’t.

Quasar automatically decides what type of servers to use for each application and how to multitask on the servers without compromising any specific task.

“Quasar recommends the minimum number of servers for each application and which applications can work best together,” Delimitrou said.
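A toy version of that co-location decision: allow two applications to share a server only if each one, after the predicted slowdown from interference, still meets its performance target. All the application names, throughputs, and interference scores below are invented for illustration; Quasar derives the real numbers from profiling and classification rather than from a hand-written table.

```python
# Toy co-location check, not Quasar's actual algorithm.
apps = {
    # name: (standalone throughput, required throughput)
    "web":   (4000, 2500),
    "batch": (3000, 1000),
    "cache": (5000, 4500),
}

# Predicted fraction of standalone throughput an app keeps when it
# shares a server with a given neighbor (hypothetical numbers).
interference = {
    ("web", "batch"): 0.80, ("batch", "web"): 0.70,
    ("web", "cache"): 0.85, ("cache", "web"): 0.80,
    ("batch", "cache"): 0.75, ("cache", "batch"): 0.90,
}

def can_share(a, b):
    """True only if both apps still meet their targets when co-located."""
    for x, y in ((a, b), (b, a)):
        standalone, required = apps[x]
        if standalone * interference[(x, y)] < required:
            return False
    return True

print(can_share("web", "batch"))  # True: 4000*0.80 >= 2500 and 3000*0.70 >= 1000
print(can_share("web", "cache"))  # False: 5000*0.80 = 4000 falls below cache's 4500
```

The real problem is harder because the interference table is mostly unknown up front; the next section describes how Quasar fills in those unknowns.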

It is not easy.

Data centers host thousands of applications on many different types of servers. How does Quasar match the right applications with the right server resources? Using a process known as collaborative filtering – the same technique that sites like Netflix use to recommend shows we might want to watch.

Applying this principle to data centers, Quasar maintains a database of how certain applications perform on certain types of servers. Through collaborative filtering, Quasar uses this knowledge to decide, for example, how much server capacity an application needs to achieve a certain level of performance, and when it is acceptable to multitask on a server while still expecting good results.
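As a rough illustration of how collaborative filtering can fill in an unmeasured (application, server type) combination, the sketch below predicts a missing throughput entry as a similarity-weighted average over applications that have run on that server type. This is only a minimal stand-in: Quasar’s published technique uses SVD-based matrix completion over profiling data, and all the application names, server types, and numbers here are invented.

```python
import math

# Sparse profiling data: measured throughput (requests/sec) of each
# application on the server types it has actually run on.
profiles = {
    "web-frontend": {"small": 1000, "large": 4000, "gpu": 1200},
    "analytics":    {"small":  200, "large": 1800, "gpu":  600},
    "ml-training":  {"small":  100, "large":  900, "gpu": 5000},
    "new-app":      {"small":  950, "large": 3800},  # "gpu" unmeasured
}

def cosine_similarity(a, b):
    """Similarity of two apps over the server types both have run on."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[t] * b[t] for t in shared)
    na = math.sqrt(sum(a[t] ** 2 for t in shared))
    nb = math.sqrt(sum(b[t] ** 2 for t in shared))
    return dot / (na * nb)

def predict(app, server_type):
    """Estimate an unmeasured entry from similar apps' measurements."""
    target = profiles[app]
    num = den = 0.0
    for other, perf in profiles.items():
        if other == app or server_type not in perf:
            continue
        w = cosine_similarity(target, perf)
        num += w * perf[server_type]
        den += w
    return num / den if den else None

# Estimate new-app's gpu throughput without ever running it there.
print(round(predict("new-app", "gpu")))
```

The prediction necessarily falls within the range of throughputs observed for the known applications; with richer profiles (more shared server types), the similarity weights discriminate more sharply between neighbors.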

Thomas Wenisch, a professor of computer science at the University of Michigan, is intrigued by the Quasar paper, in which Kozyrakis and Delimitrou show how they achieved utilization rates of up to 70 percent on a 200-server test cluster, up from a typical 20 percent, while meeting strict performance targets for each application.

“One of the reasons the Quasar paper is so compelling is that it has so much supporting data,” Wenisch said.

Next steps

Increasing the efficiency of data centers will be essential to the growth of cloud computing. These facilities already consume so much electricity that growing demand threatens to outstrip the output of power plants. So simply adding more servers is not the solution, even if money were no object.

But as they seek increased efficiency from multitasking servers, data center operators must deliver consistent service levels. They cannot allow some clients to suffer because the servers are handling the wrong mix of tasks, a deficiency known as “tail latency.”

“The explosive growth of cloud computing is going to require more research like this,” said Partha Ranganathan, senior engineer at Google who is part of the team that designs next-generation systems and data centers. “Focusing on managing resources to meet the twin challenges of power efficiency and tail latency can have significant benefits.”

Kozyrakis and Delimitrou are currently improving Quasar to accommodate data centers with tens of thousands of servers and to handle applications that span multiple data centers.

“No matter how we manage the resources in a data center, there will always be cases that exceed its capacity,” said Delimitrou. “Efficient offloading of parts of the work to other facilities is essential to achieve the flexibility promised by cloud computing.”

Tom Abate is Assistant Director of Communications at the School of Engineering.

Tom Abate, School of Engineering: (650) 736-2245, [email protected]

Dan Stober, Stanford News Service: (650) 721-6965, [email protected]

