Small but perfectly formed: Dailymotion's object storage odyssey
From Isilon to Scality to OpenIO and ARM-powered nano-nodes
Case study Paris-based Dailymotion is the world's second-largest video sharing website after YouTube. Its storage history has three stages, moving from a reliable scale-out filer, to a maturing object storage startup, to an even smaller firm, all in pursuit of scale and performance.
The story starts with Isilon, progresses through Scality, and then moves on to tiny OpenIO. Interestingly, both Scality and OpenIO were founded and are run by French entrepreneurs and execs, and Dailymotion is a French limited company.
C'est incroyable, n'est-ce pas?*
In 2005, Dailymotion launched and started out with a few Linux servers and RAID storage using SATA disk drives. But by mid-2006 content was growing at 15TB/month and surpassed the 100TB level. The Linux RAID system couldn't cope.
At the time, system architect Matthieu Blumberg dit Fleurmont said: "As a video content delivery website, our requirements focus on 24/7 availability." The company came across Isilon through its use by MySpace. It tried out a 10-node IQ 6000 cluster and went on to a 24-node system, liking the SmartConnect integrated load-balancer. The total storage pool amounted to 285TB.
In 2007 the company stated that, since deploying Isilon (PDF), it experienced record uptime and availability, with 100 per cent uptime for 1.3 billion pages on a monthly basis, and 37 million unique visitors per month.
Times change (1) Scality
Scroll forward seven years, to June 2014, and Scality announced: "Dailymotion is one of the world's largest video sites with 120 million unique visitors, more than 2.5 billion monthly video views and 300,000 to 600,000 simultaneous VoD (video on demand) sessions. Dailymotion has selected Scality to provide content archiving and to feed its direct to consumer VoD streaming system for Dailymotion's 40+ million titles, each of which is encoded in seven different formats."
The company had deployed a 10PB Scality RING object storage system, which operated "in parallel with DailyMotion's existing Isilon system. Dailymotion chose Scality because the RING delivers excellent performance while being extremely cost-effective." Scality also runs on Linux, Dailymotion's OS of choice.
Scality put out a video about how it was dealing with the site's storage.
Pierre-Yves Kerembellec of Dailymotion said its storage amounted to about 35PB, holding around 50 million videos in several formats. It was experiencing a 35 per cent growth rate a year, requiring 5PB/year of new storage capacity.
It has a cluster of 51 HP servers with 3,660 disk drives. Kerembellec said its main storage challenges are TCO (total cost of ownership), failure resiliency and ease of use. The ability to choose hardware supplier to provide the best density is important. Scality's object storage resiliency, based on erasure coding, has a 25 per cent overhead but is quite efficient compared to pure replication.
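To put the 25 per cent figure in context, here is a minimal sketch comparing the capacity overhead of erasure coding with that of plain replication. The 8+2 fragment layout, the three-copy replication policy and the 10PB data volume are illustrative assumptions, not Dailymotion's or Scality's actual configuration; the function names are hypothetical too.

```python
def erasure_overhead(data_fragments: int, parity_fragments: int) -> float:
    """Extra capacity needed per unit of stored data under erasure coding."""
    return parity_fragments / data_fragments


def replication_overhead(copies: int) -> float:
    """Extra capacity needed per unit of stored data when keeping N full copies."""
    return float(copies - 1)


def raw_needed_pb(data_pb: float, overhead: float) -> float:
    """Raw capacity (PB) required to hold data_pb of data at a given overhead."""
    return data_pb * (1 + overhead)


# Assumption: an 8 data + 2 parity layout, which works out to 25 per cent overhead.
ec = erasure_overhead(8, 2)
# Assumption: classic three-copy replication, i.e. 200 per cent overhead.
rep = replication_overhead(3)

data_pb = 10.0  # illustrative data volume, not a figure from Dailymotion
print(f"erasure coding: {ec:.0%} overhead, {raw_needed_pb(data_pb, ec)} PB raw")
print(f"replication:    {rep:.0%} overhead, {raw_needed_pb(data_pb, rep)} PB raw")
```

Under these assumptions the same 10PB of data needs 12.5PB of raw disk with erasure coding against 30PB with three-way replication, which is the kind of gap behind the "quite efficient" remark.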
Scality storage could be slotted in with no application changes, and Dailymotion enjoyed good ease of use on commodity hardware, as well as, presumably, a lower TCO – in contrast with Isilon and its proprietary hardware.
Times change (2) OpenIO
However, we have learnt that Dailymotion has recently decided to use OpenIO object storage.
January 2017 OpenIO meet up at Dailymotion
This storage uses an array of nano-nodes – disk drives with attached ARM processor systems. These are hyper-converged systems, with server, storage and networking (Ethernet) per disk drive.
Moving from object storage nodes which are HP servers with locally attached disk drives to object storage nodes which are disk drives with micro servers attached is quite a jump.
So what's happened with Scality and OpenIO?
Jérôme Lecat, Scality's CEO, said: "Dailymotion has had a dual-sourcing policy for several years. Initially Scality came in as the second source vendor. We are now becoming the prime vendor with dozens of petabytes in production on Scality Ring, and Dailymotion is selecting a second source. Its initial vendor, who was a traditional appliance vendor, is being phased out. This is actually one more confirmation point that the future of storage is software rather than appliance based."
The very big and the very small
We understand that OpenIO talks of theoretical 20,000-node systems, meaning 150PB of usable capacity from 200PB raw, using 10TB drives and allowing for 25 per cent overhead.
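The arithmetic behind that figure can be checked with a short sketch. It assumes one 10TB drive per nano-node and decimal units (1PB = 1,000TB), and it shows two readings of "25 per cent overhead", since the quoted 150PB only falls out if the overhead is taken as a share of raw capacity. The function names are illustrative.

```python
def raw_capacity_pb(nodes: int, drive_tb: float) -> float:
    """Raw capacity in PB, assuming one drive per node and 1PB = 1,000TB."""
    return nodes * drive_tb / 1000


def usable_if_overhead_of_raw(raw_pb: float, overhead: float) -> float:
    """Usable capacity when the overhead is a fraction of raw capacity."""
    return raw_pb * (1 - overhead)


def usable_if_overhead_of_data(raw_pb: float, overhead: float) -> float:
    """Usable capacity when the overhead is a fraction of the data stored."""
    return raw_pb / (1 + overhead)


raw = raw_capacity_pb(20_000, 10)
print(raw)                                    # 200.0
print(usable_if_overhead_of_raw(raw, 0.25))   # 150.0
print(usable_if_overhead_of_data(raw, 0.25))  # 160.0
```

The first reading reproduces the 150PB quoted; the second, where protection data is 25 per cent on top of the user data, would give 160PB from the same drives.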
We also know that OpenIO's system can run on SSDs; indeed a certain Japanese service provider does just that, with 10 drives per location. The software is also able to run on Raspberry Pi processors, so a shared and resilient all-SSD OpenIO system could run in modern IoT environments, such as cars, vans and lorries. It would provide the onboard storage and processing necessary, with relevant data transferred to similar but larger-scale systems in data centres for further analytical processing and longer-term storage.
Scale-out micro hyper-converged systems could have the right scale for constricted IoT environments such as cars, while also coping with the massive scale of 40PB data centres, as Dailymotion shows. Well, well, who'dda thunk it? ®
*It's unbelievable, isn't it?