Massed x86 ranks 'blowing away' supercomputer monoliths

Dell pitches modular parallel processors

Thu 14 May 2009 // 08:02 UTC

It's all money in the end

A lab scientist costs £100k/year, and you can double that for their experiments. Calleja has 300 users, so they cost £60m/year. The move from Sun to Dell, and the tenfold performance increase that came with it, must have improved the output of those users. "It's all money in the end, taxpayers' money."
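The £60m figure follows directly from the numbers quoted above. A minimal sketch of that arithmetic, assuming the doubling applies to every one of the 300 users (the variable and function names are illustrative only):

```python
# Figures quoted in the article.
SCIENTIST_COST_GBP = 100_000   # cost of one lab scientist per year
EXPERIMENT_MULTIPLIER = 2      # "you can double that for an experiment"
USERS = 300                    # size of the user base


def annual_user_cost(users: int, cost_per_scientist: int, multiplier: int) -> int:
    """Total yearly cost of the user base, in pounds."""
    return users * cost_per_scientist * multiplier


total = annual_user_cost(USERS, SCIENTIST_COST_GBP, EXPERIMENT_MULTIPLIER)
print(f"£{total / 1_000_000:.0f}m per year")  # prints "£60m per year"
```

Against a bill of that size, even a small gain in researcher output is worth more than the compute that produced it.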

Calleja upgrades his hardware every two years on a rolling procurement and keeps hardware for four years. He delivers core hours to his users and has to continually demonstrate to them that paying for his core hours is cheaper than buying their own compute facilities. He said: "We're the only fully cost-centred HPC centre in the country not relying on subsidy. We have 80 percent paying users and we're breaking even."

Why Dell? It's cheaper and extremely reliable compared to competing suppliers. He's experienced a 1 percent electronic component failure rate in two years.

Calleja is limited by power and space constraints. He is upgrading now, deploying 50 percent more compute power for 15 percent more electricity and 10 percent more floor space, with the new kit costing 20 percent of the original capital cost. That lowers his cost per core and gives his users better-value core hours.
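Those percentages can be turned into rough efficiency ratios. The sketch below assumes every figure is an increment measured against the original installation, which the article does not state outright. The function names are illustrative.

```python
# Increments quoted in the article, read here as fractions of the ORIGINAL
# installation (an assumption; the baseline is not spelled out).
EXTRA_COMPUTE = 0.50
EXTRA_RESOURCES = {
    "power": 0.15,
    "space": 0.10,
    "capital": 0.20,
}


def new_kit_gain(extra_compute: float, extra_resource: float) -> float:
    """Compute per unit of resource for the new kit alone, relative to the old kit."""
    return extra_compute / extra_resource


def whole_system_gain(extra_compute: float, extra_resource: float) -> float:
    """Compute per unit of resource for old and new kit combined, relative to before."""
    return (1 + extra_compute) / (1 + extra_resource)


for name, extra in EXTRA_RESOURCES.items():
    kit = new_kit_gain(EXTRA_COMPUTE, extra)
    system = whole_system_gain(EXTRA_COMPUTE, extra)
    print(f"{name}: new kit {kit:.2f}x, whole system {system:.2f}x")
```

On this reading the new kit delivers about 3.3 times the compute per watt of the original, and the installation as a whole improves by roughly 30 percent per watt.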

He said there are three research pillars: experiment; theory; and simulation. Simulation, using a supercomputer, enables you to go places you can't get to by experimentation. The need for simulation is horizontal across science.

Research applications now use shared and open source code that can be parameterised to provide the specific code set needed by researchers, whose time is not best spent writing code. That has become too specialised a job.

Datasets are kept in the data centre, inside the firewall. Users come to the HPC mountain rather than the mountain coming to them, which would mean massive data set transfers across network links between users and the HPC lab.

Looking ahead

Calleja has two steps on his processor roadmap. He's looking forward first of all to Nehalem blade servers, 4-core Xeon 5500s, with possibilities for 6- and 8-core ones. The second step is to Sandy Bridge, Intel's next architecture after Nehalem which, he says, will run 8 operations per clock cycle instead of Nehalem's 4.
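To see what the step from 4 to 8 operations per clock cycle means, peak throughput can be estimated as cores × clock rate × operations per cycle. This is a sketch only: it assumes the quoted figures are per core, and the clock rate is a hypothetical placeholder rather than a figure from the article.

```python
def peak_ops_per_second(cores: int, clock_hz: float, ops_per_cycle: int) -> float:
    """Theoretical peak: every core retires ops_per_cycle operations each cycle."""
    return cores * clock_hz * ops_per_cycle


CORES = 4            # 4-core Xeon 5500, as quoted in the article
CLOCK_HZ = 2.5e9     # hypothetical clock rate, used only to make the sum concrete

nehalem = peak_ops_per_second(CORES, CLOCK_HZ, ops_per_cycle=4)
sandy_bridge = peak_ops_per_second(CORES, CLOCK_HZ, ops_per_cycle=8)

speedup = sandy_bridge / nehalem
print(f"Peak speedup at equal cores and clock: {speedup:.1f}x")
```

With core count and clock held equal, the doubling of operations per cycle doubles theoretical peak throughput. Real applications would see less, depending on how well their code keeps the wider execution units busy.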

The blade servers will provide many more cores per rack, driving up the heat output, and he's anticipating moving to back-of-rack water cooling.

He's thinking of setting up a solid state drive capacity pool for HPC applications that need the IOPS rates that SSDs can deliver, but SSD pricing has to come down to make this worthwhile. Lustre metadata might be stored in an SSD pool.
