Coho Data storage arrays will be able to run Docker containers directly on the storage nodes and use Google’s Kubernetes interface for configuring and deploying microservices.
Startup Coho Data says its customers can now run new data-centric services and apps directly adjacent to stored data or, putting it another way, “allow third-party applications to run directly within its customers’ enterprise storage systems”.
It’s trying very hard to say that its scale-out, hybrid flash/disk and all-flash MicroArrays are not HCIAs (Hyper-Converged Infrastructure Appliances) like those of Nutanix, SimpliVity and their colleagues. We’re told Coho’s arrays only do closely-coupled storage/compute work, such as video stream transcoding. Cynics might suspect that’s because they only have poky little CPUs.
The arrays hook up to servers using Ethernet and can be viewed as data-massaging arrays, getting data into the shape needed for applications to reference it from their virtualised or physical servers.
Anyway, what you can do with Dockerised Coho arrays is to run Splunk and other analytics directly on the array. Coho said: “Splunk computational container nodes on Coho can be done in a split second without the overhead of setting up an additional, external Splunk cluster.”
[Image: Coho Data DataStream MicroArray]
It also said: “Containers provide an opportunity to incorporate third-party logic for enhanced data protection, including back-up agents, malware scanners and e-discovery and audit tools directly within the platform.”
Coho Data co-founder and CTO Andrew Warfield gave out a canned (containerised?) quote: “Container convergence marks the end of enterprise storage being viewed as a closet, where data is packed away inside cardboard boxes. This exciting evolution of our platform allows data to become active, easily supporting the addition of new protocols and directly integrating with customer workflows.”
Saying the data has become active is a soundbite but, let’s not beat about the bush, it’s rubbish. The data gets processed, either in a server hooked up by Ethernet or in a server that’s also a MicroArray controller. It just sits there on disk or in flash, gets fetched into memory, is processed by a CPU, and then gets written back. And that’s the same whether the array controller/server does it or a networked server does it.
Arguably, Coho Data uses spare storage controller cycles to do computation on the data: that’s all. Fine, Violin Memory can do the same thing in principle. The Coho Data array is an HCIA-lite device, being unable to host virtual machines on the array, and you still need apps in networked virtualised servers.
Warfield blogged: “Hyper-convergence has solved a virtualisation-specific storage problem (making storage invisible) in order to present the virtual machine as the core top-level abstraction. We believe in a more data-centric view of the world.”
Coho explained: “Enabling containers, as well as an internally hosted Docker registry, opens up a wide catalog of applications that can be hosted within the [Coho array] and allows for a community of contributors to extend and evolve Coho’s data platform.”
It cites some microservice examples:
- New protocols such as S3 API support
- Ability to instantiate useful tools like Splunk Light
- Big Data capabilities such as Cloudera’s CDH5 Spark and MapReduce
- On-demand video transcoding
- Live-search facility to find and extract documents from VDI environments
The Warfield blog on Coho and containerisation said: “Converging containers into enterprise storage as a means of adding active execution to stored data [provides] opportunity to analyse, transform, and present data in new ways, even as new scalable software services, within the enterprise environment. Our customers can add container images to a Docker registry that runs within the platform, and these images may be composed into rich applications and micro-services that are described using Google Kubernetes APIs.”
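To make that workflow concrete, here is a minimal sketch of what describing a containerised service through the Kubernetes API might look like: a Pod manifest that pulls an image from a registry hosted inside the array. The registry address, the image name and the `pod_manifest` helper are all hypothetical; Coho has not published these details, so treat this as an illustration of the general Kubernetes object shape rather than Coho's actual interface.

```python
import json

# Hypothetical address of the Docker registry running inside the array.
# This is an assumption for illustration, not a documented Coho endpoint.
REGISTRY = "registry.coho.example:5000"


def pod_manifest(name, image, command=None):
    """Build a minimal Kubernetes v1 Pod manifest as a plain dict."""
    container = {"name": name, "image": f"{REGISTRY}/{image}"}
    if command:
        container["command"] = list(command)
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name, "labels": {"app": name}},
        "spec": {"containers": [container]},
    }


# Hypothetical image name for an on-array video transcoding service.
manifest = pod_manifest("transcoder", "video-transcoder:1.0")
print(json.dumps(manifest, indent=2))
```

The resulting JSON is the sort of document that would be submitted to a Kubernetes API server; how Coho exposes that API on its arrays is not specified in the announcement.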
Whether Coho stays a non-VM-executing array, and whether it should, depends on whether you think a separation is needed between VMs executing on networked servers that access the array and microservices running in containers on the array. Both are effectively running apps that access data on the array. Should they be separated by a network link?
Check out a Coho DataStream 800h hybrid array data sheet here (pdf). Coho’s next DataStream software release will enable customers to instantiate Docker-based applications within the storage system. Availability hasn’t been specified but we think it will be soon. ®