HECToR, the Edinburgh-based supercomputer used by UK researchers to tackle some of science's thornier mathematical problems, is having petabytes of disk and tape storage installed in a massive expansion.
HECToR stands for the High-End Computing Terascale Resource and has been built in several phases. It is a Cray-built 800 TFlop XE6 supercomputer with 704 compute blades using AMD's "Bulldozer" multi-core processor architecture and 90TB of memory.
The system takes up 30 cabinets in an Edinburgh University data centre the size of two tennis courts.
HECToR has a petabyte of disk space in another 10 cabinets, under the control of the Lustre parallel file system. There is a 70TB BlueArc (now HDS) Titan 2200 backup NAS system with MAID (Massive Array of Idle Disks) on a Copan Revolution Virtual Tape Library array. Copan crashed and was bought by SGI in June 2010. The BlueArc system holds users' home directories. Files are first backed up to the Copan storage, using Symantec NetBackup, and then archived off to a Quantum Scalar i2000 tape library with four LTO-4 drives, 1,300 tapes and a 1.02PB capacity.
This storage is being boosted with 7.8PB of DataDirect storage arrays – accessed by GPFS, not Lustre – and 19.5PB of IBM archive tape, so the existing archive infrastructure is evidently thought inadequate for this extended storage. This new capacity, designed and built by OCF, will be networked to HECToR rather than integrated into its infrastructure; it's designed for the long term and to be available to any HECToR successor machine. It's a separate silo.
DDN is supplying its SFA10K-X storage array. The tape facility is a high-end IBM TS3500 library which can have 15 frames connected together with a single robot system and uses IBM's TS1140 tape drive.
Supercomputing wiring hell: HECToR cabling.
Professor Arthur Trew of Edinburgh University said: “Data persists beyond any computer, including HECToR, so we’re prioritising data storage, management and analysis. Doing this enables us to upgrade HECToR and integrate its successor without fear of impacting access to research data. Our expectation is that any future computer must be able to integrate seamlessly with our storage.”
OCF's MD, Julian Fielden, thinks one problem with big data isn't storing it but finding the stuff, accessing it and using it: “By making storage independent of the machine that generated it, combined with good network access and IBM’s parallel file system GPFS, the data becomes easy to locate and use by any researcher irrespective of location,” meaning anywhere in the UK. ®