
A good time Woz had by all: We peeked our head into Primary Data and this is what we saw

Upcoming early-access product details revealed

Analysis Picture this. A little press conference with Primary Data at its headquarters in Los Altos, California, right in the armpit of Silicon Valley. CEO Lance Smith is briskly burbling away about his company, but us hacks are somewhat distracted.

The Apple wizard that was, Steve Wozniak, Stephen Gary Wozniak, age 66, is sitting right next to Lance as Primary's chief scientist. Primary's execs can hardly contain themselves. Look who we have on board, they quietly beamed. Beat that, you other startups.

Then it's the Apple II designer's turn to talk. We hold our breath. And he engages Woz waffle mode. It's a pleasure to talk to us. We're thanked for taking the time to come visit. He joined Primary Data, and he'd liked Fusion‑IO, because the company visions matched his ideas. The thing is to keep stuff simple. Kick out the middle stuff. Keep it simple and as direct as you can, by design. That was it, basically.

Primary Data update

Then it was over, the star turn had ended, and back came Lance to tell us about Primary Data and its current situation.

The DataSphere product can save large enterprises millions of dollars by cutting storage over-provisioning. It sits between an enterprise's applications and on‑premises and public cloud storage, providing metadata engine-driven data placement, tiering and protection services.

DataSphere is now being marketed, depressingly, as doing machine learning and providing data fluidity between on‑premises systems and the public cloud. It has smart objective analytics and a quality of service capability.


Primary Data CEO Lance Smith

The technology is delivered as the software-only base DataSphere product and DSX, a set of extended services providing a data portal, data mover, data store and a cloud connector.

The latest (parallel) NFS v4.2 is an important component, and it has native client support for DataSphere. Primary Data says it is the leading contributor to NFS, and has been since 2013.

Smith said DataSphere was in exploratory use at several if not many Fortune 500 enterprises, which were not ready or willing to talk about it because it was providing them with competitive advantages. Net:net – no customer references and, we think, limited revenue from customers.

The company has brought out a sort of DataSphere-lite, called DataSphere for Lines of Business. This is for smaller or remote offices, ones with four filer nodes or fewer, and tiers into the cloud to avoid on‑premises storage growth. The "cloud" means an AWS S3 object store and can be on‑premises or in the public cloud.

It has automatic snapshot backup to the cloud or to a central data center object store. Primary Data intends to add sending the snapshot to another NFS volume later. There is a license-only upgrade to the full-blown DataSphere for Enterprise product.


DSX is software-only, has open source client code and scales out. It offers non-disruptive mobility – the ability to continue reading and writing to a file while it is being moved from one physical data store to another. A client requests that a file or files be moved to a new place (layout). DataSphere then routes access requests for the file to its data mover.


DSX data mobility slide

Customer client access to files relies on a DataSphere-provided access data path. While the data is being moved in the background, DataSphere provides access to the original store via the data mover. Once the move is complete, the access path is changed to the new store, with the DSX data mover exiting the data path.
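For readers who think in code, the hand-off described above can be modelled as a toy. This is a minimal sketch under our own assumptions, not Primary Data's implementation: the `Store`, `ToyMover` and `AccessPath` names are hypothetical, and the "move" is a single in-memory copy rather than a background transfer.

```python
from dataclasses import dataclass, field


@dataclass
class Store:
    """A physical data store holding file contents by name (toy model)."""
    name: str
    files: dict = field(default_factory=dict)


class ToyMover:
    """Hypothetical stand-in for a data mover; not Primary Data's code."""

    def __init__(self, source: Store, target: Store, filename: str):
        self.source, self.target, self.filename = source, target, filename

    def read(self) -> bytes:
        # During the move, reads are served from the original store.
        return self.source.files[self.filename]

    def write(self, data: bytes) -> None:
        # Writes during the move land on the original store,
        # so the final copy picks them up.
        self.source.files[self.filename] = data

    def finish(self) -> None:
        # Copy the latest contents across, then retire the original.
        self.target.files[self.filename] = self.source.files.pop(self.filename)


class AccessPath:
    """Routes client I/O for one file: direct to a store, or via a mover."""

    def __init__(self, store: Store, filename: str):
        self.store, self.filename, self.mover = store, filename, None

    def start_move(self, target: Store) -> None:
        # From here on, client I/O goes through the mover.
        self.mover = ToyMover(self.store, target, self.filename)

    def complete_move(self) -> None:
        self.mover.finish()
        self.store = self.mover.target  # path now points at the new store
        self.mover = None               # mover exits the data path

    def read(self) -> bytes:
        if self.mover is not None:
            return self.mover.read()
        return self.store.files[self.filename]

    def write(self, data: bytes) -> None:
        if self.mover is not None:
            self.mover.write(data)
        else:
            self.store.files[self.filename] = data
```

The point of the sketch is that the client only ever talks to the access path, so a write issued between `start_move` and `complete_move` still succeeds and ends up on the new store.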

DataSphere 2.0 early access

V2.0 DataSphere, now in a limited early access release, has:

  • Analytics-driven movement of inactive data
  • SMB and Active Directory, Windows ACLs, X‑Domain mapping
  • Control and recovery without interruption – assimilation, snapshot archival and backup enhancements
  • Expanded connectivity – VLAN, virtual networks and IPv6
  • Objective expressions
  • Portal protection, metadata backup and restore

Primary Data says the analytics-driven data movement goes beyond POSIX metadata and automates the movement of data with file granularity to the (S3‑compatible) cloud, using the parallel DSX cloud connector with integrated variable-block size deduplication and compression. If moved data is accessed, it is automatically brought back on‑premises.
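The basic shape of such a policy can be sketched in a few lines. The example below is an illustration under stated assumptions, not the product's logic: it uses only a last-access timestamp as the inactivity signal, which is cruder than the analytics the vendor describes, and `FileRecord`, `tier_to_cloud` and `access` are hypothetical names. Deduplication and compression are left out entirely.

```python
import time
from dataclasses import dataclass


@dataclass
class FileRecord:
    path: str
    last_access: float      # seconds since the epoch
    tier: str = "on-prem"   # "on-prem" or "cloud"


def select_inactive(files, max_idle_days, now=None):
    """Return on-premises files not accessed within max_idle_days."""
    now = time.time() if now is None else now
    cutoff = now - max_idle_days * 86400
    return [f for f in files if f.tier == "on-prem" and f.last_access < cutoff]


def tier_to_cloud(files, max_idle_days, now=None):
    """Mark each inactive file as cloud-resident, one file at a time."""
    moved = select_inactive(files, max_idle_days, now)
    for f in moved:
        f.tier = "cloud"
    return moved


def access(f, now=None):
    """Touch a file; a cloud-tiered file is brought back on-premises."""
    f.last_access = time.time() if now is None else now
    if f.tier == "cloud":
        f.tier = "on-prem"
    return f
```

Passing `now` explicitly makes the policy easy to test against fixed timestamps; a real system would presumably weigh more signals than idle time alone.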

DataSphere v2 Objectives

DataSphere v2.0 intelligent placement

Assimilation means the capture by DataSphere of existing NFS storage's metadata without interrupting data access. The assimilation of NTFS attributes is done as a post-process.

Snapshots are non-disruptive metadata snapshots – the actual data copy (if needed) happens inside existing storage. Users can move or copy the data stored in a snapshot to the cloud to protect the data without affecting primary storage capacity.

Objective expressions


Primary Data founder and CTO David Flynn

An objective expression is a piece of metadata describing data. At its simplest it covers such things as file size, the space used, or whether the data is a live snapshot. But Primary Data founder and CTO David Flynn said objective expressions could be much more powerful than that.

He talked about programmable objective expressions and routines, such as automatically generating metadata tags for incoming files, and using the tags to filter files, stick them in groups, and do things with them.

For example, all incoming files that come from a particular source could be tagged as country location-dependent and not moved beyond a national boundary. The idea is to automate the identification of files that meet certain criteria and carry out some action on them, saving storage admin time and money and also opening the door to file-based filtering, grouping and actions against vast populations of files where manual activity is impractical.
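The national-boundary example lends itself to a short sketch. What follows is our own illustration of tag-then-filter, not Primary Data's objective expression syntax, which was not shown at the briefing: the `RESIDENCY_BY_SOURCE` table, the source names and the `residency` tag are all invented for the purpose.

```python
from dataclasses import dataclass, field


@dataclass
class FileMeta:
    path: str
    source: str                       # where the file came from
    tags: dict = field(default_factory=dict)


# Hypothetical rule table: files from these sources must stay in the named country.
RESIDENCY_BY_SOURCE = {"de-payroll-feed": "DE"}


def tag_on_ingest(f: FileMeta) -> FileMeta:
    """Automatically tag an incoming file based on its source."""
    country = RESIDENCY_BY_SOURCE.get(f.source)
    if country is not None:
        f.tags["residency"] = country
    return f


def may_move(f: FileMeta, target_country: str) -> bool:
    """A file with a residency tag may only move within that country."""
    required = f.tags.get("residency")
    return required is None or required == target_country


def movable(files, target_country):
    """Filter a population of files down to those allowed to move."""
    return [f for f in files if may_move(f, target_country)]
```

Once the tag is applied at ingest, the same filter works whether there are ten files or ten billion, which is the admin-time argument Flynn was making.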

Reg comment

Primary Data is in that phase of startup development where customers are seriously trying out its products and product engineering is partly focussed on refining the product to make a better fit to customers' requirements.

The company took in its last funding round, a $13m B‑2 round, in 2014. Here it is three years later with the product being proved in real-life use. If this is successful, we might expect a further funding round to cover the costs of building out an extended go-to-market business infrastructure and further engineering.

El Reg thinks that other companies are entering the hybrid cloud data management space, such as Actifio, Catalogic, Cohesity, Komprise, NetApp, Rubrik and others. They may well think that they do not compete with one another. We wonder if, once a customer has bought into the hybrid cloud data management ideas of one of these vendors, it will buy products from another?

David Flynn thinks they will. We suppose that these suppliers will have to be crystal clear to themselves and their customers about the use cases that suit them and differentiate them from the other data management vendors. ®
