
Google's Urs Hölzle: If you're not breaking your own gear, you aren't ambitious enough

Infrastructure king on next-gen memory, FPGAs, and more

Who keeps Google running, the engineers or their software?

When we look at the evolution of cluster management systems such as Omega or Borg, there's an emphasis on autonomic computing, or ceding elements of control or decision-making out to the edge. How far are you willing to go on that?

Generally speaking, a centralized system is simpler because all the state is in one place and you have one algorithm that makes a decision and then it sticks, whereas a decentralized system can react faster. You may get into much more complexity because different actors are making independent decisions and they may not actually align.

You can have emergent problems?

Well exactly, and they fight each other. You have to find the right balance and like any engineering decision this isn't a black and white [choice]. There's no one true answer. It's good taste and actual experience.

One of the biggest things that helps us is we've been doing this for ten years so you get a better feel for what works and what doesn't. I think in pretty much any system we have centralized control and decentralized control and we think fairly carefully where we put the individual pieces together.

So, for example, a trivial example in cluster management: if a task unexpectedly dies – segfaults – should you restart it locally or should you restart it globally? That's a relatively easy decision because most likely the local restart will fix it because this was some random corner decision and restarting the app right there, in the same slot, is gonna work. At the same time if that happened a hundred times in a row should you continue to try and fix it locally? No, probably not.
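The escalation rule in that example can be sketched in a few lines of Python. This is only an illustration: the class name, the method names, and the threshold of three local retries are assumptions, not details of Borg or Omega.

```python
from dataclasses import dataclass


@dataclass
class RestartPolicy:
    """Hypothetical policy: retry in the same slot first, then escalate."""

    max_local_restarts: int = 3  # assumed threshold, not a figure from the interview
    consecutive_failures: int = 0

    def on_crash(self) -> str:
        """Record one crash and return where the task should be restarted."""
        self.consecutive_failures += 1
        if self.consecutive_failures <= self.max_local_restarts:
            return "local"   # a one-off fault is most likely fixed by a local restart
        return "global"      # repeated crashes: hand the decision to the central scheduler

    def on_healthy(self) -> None:
        """Reset the counter once the task has run cleanly again."""
        self.consecutive_failures = 0
```

The local branch reacts quickly without consulting anyone, while the global branch hands the decision to the component that sees the whole cluster, which is the split between decentralized and centralized control described above.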

Something that is trendy at the moment is using deeper neural networks or hierarchical neural networks to manage infrastructure – where is Google on that?

Generally speaking neural networks and machine learning, applied to any situation where you have large-ish historical data, or real-time data, and you can train on that, can find something more useful in the information. One of the things we recently wrote about was our data center building management. [Where Google disclosed how an employee used a neural network to better predict how data center power usage effectiveness (PUE) responded to tweaked inputs – Ed].

That's a great example of how powerful this approach is. What we did there is construct a model and actually now the [datacenter operations workers] use it for telling them what to do. That model does not know how the data center is built, it is purely statistically inferred, and we can do that because we had millions or billions of lines of historical data points about the system configuration and the PUE, and that lets you learn about which factors matter.

We've had it make very interesting recommendations, to which the local team responds: 'Wow, we would never come up with that.'

That is a prototypical example of where it is helpful. Not to substitute for human judgement, the operator is still responsible, but for keeping track of the minutiae and mining interesting options out of it and that can happen with error analysis - are programs behaving inappropriately, right now? If you have a lot of historical data you can do very well. One of the things we're aiming for, and we talked about for the cloud, is actually helping you run your application by automatically giving you alerts that you didn't configure.
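One simple way to picture an alert that nobody configured is a statistical test against historical values. The sketch below assumes a plain z-score check; the function name and the default threshold are hypothetical, and nothing in the interview says this is how Google computes its alerts.

```python
from statistics import mean, stdev
from typing import Sequence


def is_anomalous(history: Sequence[float], latest: float, threshold: float = 3.0) -> bool:
    """Flag `latest` if it lies more than `threshold` standard deviations from the historical mean.

    `history` needs at least two points so a sample standard deviation exists.
    """
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        # Perfectly flat history: any change at all is unusual.
        return latest != mu
    return abs(latest - mu) / sigma > threshold
```

The more historical data there is, the better the estimate of what normal looks like, which matches the point above that a lot of historical data lets you do very well.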

One of the areas we've seen Google do commercially and internally is hooking multiple DCs together via tech such as global load balancing; what are some of the challenges there?

It's not a focus at all today, and the reason is it was a focus seven years ago. Today it's a solved problem. The work we had to do was on the UI, to make it configurable by someone who lives in our cloud product rather than our internal product. We didn't write any new load balancing code because this is the same code we wrote a long time ago, and it is really battle tested. This is a solved problem.

One of the things that helps us be competitive in this field, and have some deep features rolled out relatively quickly, is we actually did the work a long time ago.

This is about packaging it to you as an external programmer. That is the key challenge we face. It is a tricky thing - for example, this load balancing was not easy to get right because one of the things you have is feedback loops where, if you try to react automatically, you may do more damage than you're preventing. That is a common occurrence with the sort of global load balancing that we struggled with a while ago, maybe seven years or so ago, to really get right, and the great thing is as a user of the external cloud you benefit from all that work. It's unlikely to happen to you because it happened to us five years ago.
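The feedback-loop hazard can be shown with a toy model. In this sketch two sites share traffic and a balancer moves a fraction of the observed imbalance from one to the other on every step. The function name and the `gain` parameter are assumptions for illustration, not part of any Google system.

```python
from typing import List


def rebalance(load_a: float, load_b: float, gain: float, steps: int) -> List[float]:
    """Return the imbalance (load_a - load_b) after each rebalancing step.

    `gain` is the fraction of the observed imbalance shifted from A to B per step.
    """
    history = []
    for _ in range(steps):
        shift = gain * (load_a - load_b)  # react to the imbalance just observed
        load_a -= shift
        load_b += shift
        history.append(load_a - load_b)
    return history
```

In this model each step multiplies the imbalance by (1 - 2 × gain). A gain of 1.0 flips the sign every step, so the traffic sloshes back and forth indefinitely, while a gain of 0.25 halves the imbalance each time and settles. Reacting less aggressively is what keeps the automatic response from doing more damage than it prevents.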

Microsoft recently published a paper about a system called Catapult that paired x86 servers with FPGAs for use in Bing search – what's your thinking on adding in something like an FPGA or custom chip?

The struggle I think you have always there is sort of between velocity and efficiency. The problem is FPGAs are hard to program, and if you have underlying software that changes literally every day – like many of our systems do pushes every day – it's kind of really hard to match that with a hardware-ish cycle. So that constrains it down to some specialty applications, but I can totally see that for a large and relatively static app that they use for scoring in the search engine, there may be an actual payoff.

The second challenge though is you need to look at this in a three year period, because in a three year period even scoring may actually radically change, it certainly has for us. Then you have the sunk cost problem where this thing is still there and now you can't use it anymore with the system or it's paired wrong with the CPUs, too much FPGA per CPU or too little FPGA per CPU, that makes the cost equation a little bit harder.

I thought it was a very interesting experiment. One of the ways to get efficiency - both cost and energy - is to have specialized circuits for sub-functions, it's certainly true of graphics - it's definitely happened in GPUs. But you basically do the same thing in a Microsoft-ish way for applications, the problem is either they need to be a large class of things like graphics, a large class of app scheming, or you need a very vertical and very large and very static one and that's still going to be a problem - I wouldn't expect that 50 percent of servers in the cloud have FPGAs in the next few years unless they really become much more programmable.

There's a lot of academic research there and they might actually, but so far it hasn't really worked. That would be exciting because it makes one more piece of hardware more software-ish, more programmable. ®
