Object storage adoption: Why, when, where… and, importantly, but
Build a better interface and the world will... wait, where are you all going?
In one of my recent posts, I wrote that private object storage is not for everyone, especially if you don’t have the scale to make it viable. On the other hand, we are all piling up boatloads of data, and users need to access it from many different locations, applications and devices at any time.
Object storage characteristics are ideal for building a horizontal platform for this kind of job, and sometimes it makes a lot of sense to implement an on-premises infrastructure even when dealing with smaller capacities. In this case, small means on the order of a hundred terabytes.
If object storage isn’t for you today, there is a chance it will be tomorrow. In this post I would like to recap some of the benefits of object storage and share some ideas about where and how to start thinking about it.
Why object storage
You can find many object storage products on the market now. Some of them can start out quite small, while others make sense only when dealing with multi-petabyte capacities. Different architectures are better suited to smaller or larger objects, or to specific workloads, but they all now support a similar set of APIs for accessing data, with the Amazon S3 API currently winning out over Swift.
In fact, support for the S3 API is the first feature end users usually look for, because it greatly simplifies the search for solutions at the front end. An object storage infrastructure has to be considered as a common horizontal platform on which to provide different services. In some cases, the object store can provide some of these services – scale-out NAS, for example – but most of these services are implemented through external appliances or applications which leverage APIs.
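To make the portability point concrete, here is a minimal sketch of why a shared API simplifies the front end. It builds path-style S3 object URLs (endpoint, then bucket, then key) for two back ends. The on-premises host name, the bucket and the key are hypothetical, chosen only for illustration, and real requests would also need authentication, which is left out here.

```python
from urllib.parse import quote


def object_url(endpoint: str, bucket: str, key: str) -> str:
    """Build a path-style S3 URL: <endpoint>/<bucket>/<key>."""
    base = endpoint.rstrip("/")
    # quote() percent-encodes unsafe characters but keeps "/" in the key
    return f"{base}/{quote(bucket)}/{quote(key)}"


# Hypothetical endpoints: only this mapping changes between back ends.
ENDPOINTS = {
    "public-cloud": "https://s3.amazonaws.com",
    "on-premises": "https://objects.example.internal",
}

urls = {
    name: object_url(endpoint, "reports", "2015/q1 summary.pdf")
    for name, endpoint in ENDPOINTS.items()
}

for name, url in urls.items():
    print(name, url)
```

The application code is identical for both targets. Only the endpoint differs, which is why S3 compatibility tends to be the first item on the checklist.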
Object storage systems commonly share some basic characteristics, such as multi-tenancy, security, geo-distribution, automatic data replication, policy-based data protection, very high resiliency and reliability, as well as high availability. Performance is not usually at the top of the list, but depending on the use case or specific implementation, it could be. Infrastructures are built on top of commodity hardware and the architecture design usually involves distributed nodes.
All these combined characteristics drive down both TCO (total cost of ownership) and TCA (total cost of acquisition) considerably, to the cents/GB level. And this is another reason why you could be interested in it.
When and where to use object storage
There are many use cases where object storage can be a good fit, especially if your organisation is developing new applications capable of leveraging it. But in this post I want to focus just on infrastructure, and mostly on off-the-shelf solutions. In fact, many end users start adopting object storage with traditional protocols or applications and then add more over time.
If a similar strategy isn't adopted – starting out small and growing over time with the number of applications and capacity – object storage will just be a small isolated storage island and it won’t be worth the initial effort in the long term. In this case, it could become more of a problem than a solution.
Back to possible adoption scenarios:
- NAS comes before anything else. It may sound odd, but this is actually what many end users ask for – traditional NAS, distributed NAS and scale-out NAS – and I’m not talking just about capacity. By decoupling capacity (the object store) from the front end (an external appliance with cache and efficiency features), it is possible to serve any kind of high-performance workload without worrying about the traditional issues of classic NAS, such as back-up, DR and capacity management. A vivid example here comes from Avere Systems, which can serve even HPC workloads with just a handful of its appliances at the front end, even when the object storage system behind them is remotely deployed and high-latency.
- Sync & share (S&S) is one of the most common applications with an object store at the back end. Dropbox and all the others are based on object storage, for example. This solution has many benefits, especially when there is a high number of remote offices and mobile workers in the organisation. In this particular case, it’s not about the quantity of data, but it is much more about keeping control over your data while giving end users the best in terms of data mobility – aka good user experience in a walled garden.
- All the rest of your data. In this category you can find many different applications, ranging from active archiving to back-up in various forms. In fact, the number of primary storage vendors supporting the S3 API to make clones or copies of data volumes is increasing, as is the number of back-up vendors that now support object storage as a target. In the same category, although the application is totally different, some analytics applications are starting to take advantage of these kinds of repositories for storing data. In the most advanced implementations, they are leveraging them for in-place data analytics too.
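The NAS scenario above, where a caching front end is decoupled from the capacity tier, can be sketched in a few lines. This is only an illustration of the read-through caching idea, not how any particular appliance works: the class names, the in-memory "object store" and the least-recently-used eviction policy are all assumptions made for the example.

```python
from collections import OrderedDict


class ObjectStore:
    """Stand-in for a remote, high-latency capacity tier (hypothetical)."""

    def __init__(self):
        self._objects = {}
        self.reads = 0  # counts how often the back end is actually hit

    def put(self, key, data):
        self._objects[key] = data

    def get(self, key):
        self.reads += 1
        return self._objects[key]


class CachingFrontEnd:
    """Read-through cache with LRU eviction in front of the object store."""

    def __init__(self, store, capacity):
        self.store = store
        self.capacity = capacity
        self._cache = OrderedDict()

    def read(self, key):
        if key in self._cache:
            # Cache hit: mark as most recently used, skip the back end
            self._cache.move_to_end(key)
            return self._cache[key]
        # Cache miss: fetch from the object store and keep a local copy
        data = self.store.get(key)
        self._cache[key] = data
        if len(self._cache) > self.capacity:
            # Evict the least recently used entry
            self._cache.popitem(last=False)
        return data


store = ObjectStore()
store.put("a", b"alpha")
store.put("b", b"beta")
store.put("c", b"gamma")

front = CachingFrontEnd(store, capacity=2)
front.read("a")  # miss
front.read("a")  # hit, served locally
front.read("b")  # miss
front.read("c")  # miss, evicts "a"
front.read("a")  # miss again, because "a" was evicted
print("back-end reads:", store.reads)
```

Hot data is served from the front end, while the object store only sees misses. That is the property that lets a small cache tier hide the latency of a large, remote back end.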