
Ice cold: How hard man of storage made Everest climb look easy

Rack 'em and stack 'em: The only thing in the cloud was Sagarmatha’s peak

Render workload and infrastructure

Everest was created as a 3D IMAX movie for Universal Studios and was also shown in ordinary 3D and 2D. Each frame in the 2D Everest movie is 8MB and there are 24 frames per second (fps), meaning 192MB for each second of footage.
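
For scale, here is a quick back-of-the-envelope sum in Python using only the figures above; the two-hour running time is our own illustrative assumption, not a number from the production.

```python
# Back-of-the-envelope footage data rate, using the figures quoted above.
FRAME_SIZE_MB = 8            # one 2D frame
FRAMES_PER_SECOND = 24

per_second_mb = FRAME_SIZE_MB * FRAMES_PER_SECOND      # 192 MB per second
per_minute_gb = per_second_mb * 60 / 1024              # ~11.25 GB per minute

# Assuming a roughly two-hour running time (an illustrative figure,
# not from the article):
feature_tb = per_minute_gb * 120 / 1024                # ~1.3 TB

print(f"{per_second_mb} MB per second of footage")
print(f"{per_minute_gb:.1f} GB per minute of footage")
print(f"~{feature_tb:.1f} TB for a two-hour feature (illustrative)")
```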

Gomes tells us each frame can take 8, 12 or even 16 hours to render. One upside is that this makes the server resource budget predictable – for example, 50 servers for two days. There were some 900 VFX shots for Everest, and RVX did about 500 of them, with sub-contractors such as Framestore and ILP doing the remainder.
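
To see why long per-frame render times still give a predictable budget, here is a rough sketch of the sums, using the figures Gomes quotes and assuming, purely for illustration, one frame per server at a time.

```python
# Rough render-farm budgeting sketch based on the figures quoted above.
# Assumes each server renders one frame at a time, purely for illustration.
SERVERS = 50
DAYS = 2
HOURS_PER_FRAME = 12         # the middle of the 8-16 hour range quoted

server_hours = SERVERS * DAYS * 24                 # 2,400 server-hours
frames_rendered = server_hours / HOURS_PER_FRAME   # 200 frames
seconds_of_footage = frames_rendered / 24          # ~8.3 seconds at 24fps

print(f"{server_hours} server-hours buys ~{frames_rendered:.0f} frames "
      f"(~{seconds_of_footage:.1f} seconds of footage)")
```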

Digital artists use software tools such as Maya, Nuke, Houdini, PFtrack and ZBrush, operating inside an NFS environment, to create the effects they need, such as adding condensed breath vapour, or ice breath, to the climbers as they ascend high and snowy slopes.

ZBrush was used to create sculpted tent material folds, glacier and boulder details, and elements of the digital doubles’ clothing. In the crevasse ladder-bridging sequence, almost 100 per cent of the visual effects involved ZBrush work.

Each time they work, layering on FX magic, they’ll render the resulting footage and then play it back to see how well they’ve done and if they need to do more. The faster they can turn around each cycle of their work the better, and the rendering infrastructure has to keep up with them – which is where Gomes had a problem to deal with.

RVX’s production workload came in from Kormákur’s film cameras, Pinewood Studios, and also its sub-contractors. For IMAX and other 3D cinemas, 2D-to-3D conversion was done by Stereo D.

The overall rendering workload has three components and three corresponding storage nodes (the split is sketched in code after the list):

  • Read-intensive storage node to provide recorded footage for the artists to work on
  • Write-intensive storage node for big, heavy calculations, such as spindrift vortices; these use big files, take up to 90 hours in extreme cases, and generate 300GB data files
  • Read- and write-intensive storage node for compositing jobs, such as reading 1,000 frames, making some change and writing them back, plus 3D renders with Maya
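
One way to picture that three-way split is as a simple routing rule from job type to storage target. The profile names and mount points below are hypothetical, invented for illustration; only the split itself comes from RVX's setup.

```python
# Hypothetical mapping of RVX's three workload profiles to storage nodes.
# The mount points and profile names are invented for illustration;
# only the three-way split comes from the article.
STORAGE_NODES = {
    "footage":     "/mnt/read-node",    # read-intensive: recorded plates for artists
    "simulation":  "/mnt/write-node",   # write-intensive: long FX sims, ~300GB outputs
    "compositing": "/mnt/rw-node",      # mixed: read 1,000 frames, tweak, write back
}

def storage_for(job_type: str) -> str:
    """Return the storage node a render job should be pointed at."""
    return STORAGE_NODES[job_type]

print(storage_for("simulation"))   # /mnt/write-node
```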

Gomes looked at HP 3PAR, Dell EqualLogic, NetApp, and IBM’s V7000 but didn’t have the budget for them, choosing Infortrend and Supermicro gear instead, adding that the “V7000 has nice features. It’s fast and can control other storage, like Infortrend. But the licensing scheme is where things fall apart, with licensing per TB, and I steer clear of that.”

“EqualLogic and 3PAR have the tools available but I can’t justify the price. I need lots of space and speed, so I save money on bells and whistles,” he added.

The 500TB RVX storage infrastructure has 3x Infortrend chassis, each with 16 disk drives. One unit has 15K SAS drives, a second 24TB of SATA, and a third 7TB of home directory-type data. There are also newer Supermicro storage chassis, each with 36 SAS disk drives and 2 x SSDs for RAID controller metadata.

Gomes said he decided not to use 72 x HDD chassis as the density was too high. He likes Supermicro’s reliability, saying: “I have never had a dead-on-arrival Supermicro chassis.”

Entering the Verne data centre campus. Feel the cooling power of all that snow

There is a 200TB, 15U storage server unit for nearline storage. It uses the Btrfs copy-on-write Linux filesystem with on-the-fly compression.

Gomes says there is about a millisecond’s latency talking to the data centre. This in itself isn’t too bad but, when artists open a directory with lots of files, their system could hang for a second or two. He thinks traditional WAN optimisation products, the Steelhead-type approach, wouldn’t have worked well enough to kill the latency issue that affects some workloads.
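
The hang is easy to account for: directory metadata requests tend to go out one after another, so the round trips add up. Here is a minimal sketch of that arithmetic, with the file counts chosen for illustration.

```python
# Why a ~1ms link makes directory listings feel slow: metadata requests
# are largely sequential, so latency multiplies by the number of files.
RTT_MS = 1.0                 # round trip to the data centre

for file_count in (100, 1000, 2000):
    # Assume roughly one synchronous round trip per file (attribute
    # lookup) - a simplification for illustration.
    hang_seconds = file_count * RTT_MS / 1000
    print(f"{file_count:>5} files -> ~{hang_seconds:.1f}s before the listing completes")
```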

Gomes wanted the nearest thing possible to having a file store located in the digital artists’ office, which meant a high-performance caching buffer. He looked at a trial of Linux’s FS-Cache but decided it would take far too long to get working.

The chosen kit was a clustered pair of Avere FXT 3500 edge filers, which cache both read and write IOs across a multi-level design of memory, NVRAM, SSD and disk. They each have 2 x 10GbE links to the data centre and provided the fastest and most cost-efficient way of solving the problem.
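
Conceptually, an edge filer of this kind sits between the workstations and the remote store, serving repeat reads locally and absorbing writes before flushing them back over the link. The sketch below is a heavily simplified illustration of that general idea, not how the Avere FXT is actually implemented.

```python
# Much-simplified sketch of a write-back edge cache sitting in front of a
# remote file store. This illustrates the general idea only - it is not
# how the Avere FXT works internally.
class EdgeCache:
    def __init__(self, backend: dict):
        self.backend = backend      # stands in for the remote NFS store
        self.cache = {}             # local fast tier (memory/SSD in real life)
        self.dirty = set()          # writes not yet flushed to the backend

    def read(self, path: str) -> bytes:
        if path not in self.cache:                 # cache miss: fetch once,
            self.cache[path] = self.backend[path]  # paying the WAN latency here
        return self.cache[path]                    # later reads are local

    def write(self, path: str, data: bytes) -> None:
        self.cache[path] = data                    # absorb the write locally
        self.dirty.add(path)                       # flush to the backend later

    def flush(self) -> None:
        for path in self.dirty:
            self.backend[path] = self.cache[path]
        self.dirty.clear()

# Usage (paths are hypothetical): the first read goes over the link,
# repeats are served from the edge.
remote = {"/shots/ev_0420/frame_0001.exr": b"..."}
edge = EdgeCache(remote)
edge.read("/shots/ev_0420/frame_0001.exr")   # miss: remote fetch
edge.read("/shots/ev_0420/frame_0001.exr")   # hit: local
edge.write("/shots/ev_0420/frame_0002.exr", b"...")
edge.flush()
```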

He has rendering done both in the artists’ office and in the Verne data centre using a 6-blade server system, thus spreading the load. These FXT 3500 systems provide a 5TB working set size, more than enough for Gomes’ current needs. If he needs more back-end storage capacity he can just plug in another Supermicro storage unit and the FXTs will cope.

This decision meant RVX made the most cost-effective use of its leased line and its per-GB-transferred pricing.
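
The saving on a per-GB-billed line is easy to estimate. The monthly read volume and cache hit rate below are hypothetical numbers picked for illustration; only the pricing model comes from RVX's situation.

```python
# Illustration of why caching matters on a per-GB-billed leased line.
# The monthly read volume and cache hit rate are invented for the example.
MONTHLY_READS_TB = 100          # hypothetical read traffic from the artists
CACHE_HIT_RATE = 0.9            # hypothetical share served from the FXT pair

billed_tb = MONTHLY_READS_TB * (1 - CACHE_HIT_RATE)
print(f"Without the cache: {MONTHLY_READS_TB} TB billed over the line")
print(f"With a {CACHE_HIT_RATE:.0%} hit rate: {billed_tb:.0f} TB billed")
```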

If the RVX artists' workstations had hooked up directly to the data centre storage units, render queues could have built up. In one incident a render queue to one server did become full and it was taking up to 10 seconds to respond. But because the front-end FXTs were in place, this was all hidden from the workstation users, who saw no delays.

Using information from the FXT management console, Gomes found that a storage load needed redistributing across storage units, and sorted the problem in 40 minutes.

In-progress RVX production work is protected by snapshots to nearline storage. Post-production archives are stored on LTO-5 tapes using a pair of StorageDNA drives; it’s the most cost-effective method.
