Caringo predicts stiffer penetration of big data boxes

Pokes stuffed filers with hockey stick growth

Marketing veeps love hockey sticks and the idea that sales growth could accelerate in a way resembling the curve of such a stick on a chart really turns them on.

Hockey sticks are hard to find if you are in object storage, but glimmers can be seen, especially through marketeers' telescopes. Storage biz Caringo thinks, and hopes, that vast vaults of unstructured data will deliver the hockey stick it wants.

Caringo CEO Mark Goros spoke at a press briefing in Sunnyvale and said that CAStor, Caringo's content-addressed store, is in its fifth version and rock-solid.

CAStor is OEMed by Dell for its DX6000 object storage array, and that has given Caringo lots of extra heft in the marketplace. Every object storage supplier is convinced that scads of unstructured data are flooding into business filers and overwhelming them. Most of it is fixed and rarely, if ever, needs changing, so there is no need to store it in filers built for storing data that changes often.

Step forward object storage, which says it can store this fixed content more efficiently than filers, with higher performance, and much greater scale. Here's where El Reg starts nit-picking, as these claims are relatively untested and unproven. Most actual implementations of object storage have not been at fantastic scale and have not shown up the inadequacies of filesystem-based storage in performance, efficiency, and scalability terms.

Indeed, with EMC buying Isilon and IBM boosting SONAS performance with flash, object storage looks as if it's struggling to keep up.

Forget the scalability and performance and efficiency points. The key thing is this: object storage is cheaper than filers because you don't need RAID, you don't need fancy array interconnects, and you can use cheap and cheerful JBODs. Yes, object storage does scale, and filesystems can slow down as they fill up, but until you're storing multiple tens of petabytes and beyond, these limitations won't necessarily show up.

Nits picked, let's return to Caringo. It has about 400 customers with 100 or so of them coming in through the year-old Dell OEM deal; that's how important Dell is to Caringo.

What a dose of CAStor oil does

CAStor provides a single flat address space for objects, which contain all of a file's data plus all of the system and any user metadata, and are identified by a globally unique 128-bit UUID string.

Objects are written sequentially: a fresh object is written after the last of the current objects on a node's drives. In other words, it is appended. Changed objects are written as new objects, and the older version of the object is marked for deletion, with its space recovered by a background garbage collection process.
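For the code-minded, the append-and-collect behaviour described above can be modelled in a few lines of Python. This is a toy sketch, not Caringo's implementation: the `AppendOnlyNode` class and its method names are made up for illustration, the "drive" is an in-memory list, and giving a changed object a fresh identifier is an assumption the briefing did not confirm.

```python
import uuid


class AppendOnlyNode:
    """Toy model of one node: objects are appended, never overwritten in place."""

    def __init__(self):
        self.log = []    # entries of [object_id, data, live], in write order
        self.index = {}  # object_id -> position in the log

    def write(self, data: bytes) -> str:
        # A random 128-bit identifier, rendered as 32 hex characters.
        object_id = uuid.uuid4().hex
        self.index[object_id] = len(self.log)
        self.log.append([object_id, data, True])  # always added at the end
        return object_id

    def delete(self, object_id: str) -> None:
        # Deleting only marks the entry; the space is reclaimed later.
        position = self.index.pop(object_id)
        self.log[position][2] = False

    def update(self, old_id: str, data: bytes) -> str:
        # A change is a new object plus a delete marker on the old version.
        # Assumption: the new version gets a new identifier.
        new_id = self.write(data)
        self.delete(old_id)
        return new_id

    def read(self, object_id: str) -> bytes:
        return self.log[self.index[object_id]][1]

    def collect_garbage(self) -> int:
        # Background-style pass: drop dead entries and renumber the index.
        live = [entry for entry in self.log if entry[2]]
        reclaimed = len(self.log) - len(live)
        self.log = live
        self.index = {entry[0]: i for i, entry in enumerate(self.log)}
        return reclaimed
```

Writing an object, updating it once and then calling `collect_garbage()` reclaims exactly one entry, the superseded version, while the new version stays readable.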

Object UUIDs are kept in RAM for extremely fast look-up, and this UUID table is built afresh whenever the system is booted. System metadata holds lifecycle information such as whether the object is immutable or not. An object is contiguous on disk and not split into 4K blocks as with a file system, so reading is faster as it's sequential.
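A rough Python sketch shows how such a look-up table could be rebuilt at boot by scanning contiguous objects. Everything specific here is an assumption for illustration: the 20-byte header layout (16-byte identifier plus a 4-byte length), the function names, and the use of a `bytearray` as a stand-in for a drive. The article does not describe CAStor's on-disk format.

```python
import struct
import uuid

# Hypothetical record header: 16-byte object ID, then 4-byte payload length.
HEADER = struct.Struct(">16sI")


def append_object(disk: bytearray, data: bytes) -> uuid.UUID:
    """Append one object as a single contiguous record; return its ID."""
    object_id = uuid.uuid4()
    disk.extend(HEADER.pack(object_id.bytes, len(data)))
    disk.extend(data)
    return object_id


def rebuild_index(disk: bytearray) -> dict:
    """Scan the whole 'drive' once, as at boot, and build the in-RAM table."""
    index = {}
    offset = 0
    while offset < len(disk):
        raw_id, length = HEADER.unpack_from(disk, offset)
        offset += HEADER.size
        index[uuid.UUID(bytes=raw_id)] = (offset, length)  # payload location
        offset += length  # jump straight to the next record
    return index


def read_object(disk: bytearray, index: dict, object_id: uuid.UUID) -> bytes:
    """One dictionary look-up, then one sequential read of the payload."""
    start, length = index[object_id]
    return bytes(disk[start:start + length])
```

Because each payload sits in one piece, a read is a single slice rather than a walk across scattered blocks, which is the argument being made for sequential reads. Whether that is measurably faster in practice is a separate question.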

The minimum cluster size is three nodes, and the nodes are peers. An object write gets a copy written to a second node for protection. There is no central map of which node has which object. If the cluster loses a node, the system rebuilds its contents from the replicas distributed around the other nodes. Rebuilding an object sparks a re-replication, so the copies that lived on the lost node are recreated elsewhere in the cluster.
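The two-copy scheme and the rebuild after a node loss can be sketched as follows. This is a simplified model under stated assumptions: the `Cluster` class, the least-loaded placement rule and the way survivors are polled are all invented for the example, and how CAStor nodes really find each other's objects was not covered in the briefing.

```python
import uuid

REPLICAS = 2  # each object is held on two nodes


class Cluster:
    """Toy peer cluster: each node is just a dict of object_id -> data."""

    def __init__(self, node_names):
        if len(node_names) < 3:
            raise ValueError("minimum cluster size is 3")
        self.nodes = {name: {} for name in node_names}

    def _least_loaded(self, exclude=()):
        # Hypothetical placement rule: pick the emptiest eligible node.
        candidates = [n for n in self.nodes if n not in exclude]
        return min(candidates, key=lambda n: len(self.nodes[n]))

    def write(self, data: bytes) -> str:
        object_id = uuid.uuid4().hex
        holders = []
        for _ in range(REPLICAS):
            node = self._least_loaded(exclude=holders)
            self.nodes[node][object_id] = data
            holders.append(node)
        return object_id

    def holders(self, object_id: str) -> list:
        # No central map: ask every node whether it has the object.
        return [n for n, objs in self.nodes.items() if object_id in objs]

    def read(self, object_id: str) -> bytes:
        return self.nodes[self.holders(object_id)[0]][object_id]

    def lose_node(self, name: str) -> int:
        """Drop a node, then re-replicate every object left with one copy."""
        del self.nodes[name]
        repaired = 0
        surviving_ids = {oid for objs in self.nodes.values() for oid in objs}
        for oid in surviving_ids:
            current = self.holders(oid)
            while len(current) < REPLICAS:
                target = self._least_loaded(exclude=current)
                self.nodes[target][oid] = self.nodes[current[0]][oid]
                current.append(target)
                repaired += 1
        return repaired
```

In a three-node cluster, losing one node leaves every affected object with a single copy, and `lose_node()` copies each of them to the other survivor. The sketch makes no attempt to handle a cluster shrinking below two nodes.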

A hot object can be replicated in RAM to avoid bottlenecking on spindles.

Caringo and filers

Goros said: "CAStor is biased to storing fixed content that doesn't change much. Changing data is not a use case for us... We are not looking to replace file storage. Our customers tend to be building new applications in medical, government, and the media and entertainment areas ... The original mission of the company was to change the economics of fixed content storage."

Caringo, and by extension object storage in general, is not aiming to replace filesystems with object storage. Instead it aims to provide a more cost-effective alternative to filers for fixed content data.

Some object storage marketing says object storage is simply better than filesystem storage. For example, a Caringo spokesperson thinks CAStor is four times faster than general filers. Another point made is that CAStor doesn't use RAID – and RAID rebuilds are slower than CAStor drive rebuilds.

But there are no actual numbers saying CAStor gets data faster than a file system or rebuilds its drives faster than a RAID rebuild. And we infer that customers are not buying this because it's a faster alternative to filers.

