Why two scale-out NAS, IBM? One's a pickup, the other's a juggernaut
Spectrum NAS overlaps with Spectrum Scale, but there are differences
Analysis IBM already had a scale-out NAS (filer) when it announced Spectrum NAS last month: Spectrum Scale, which can grow to 16,000-plus nodes. Why does it need another?
The two overlap in the same way as a Dodge pickup and a Mack truck. Sure, they can both carry small loads, but using a Mack truck for Dodge pickup loads is a waste of money. If they both turn up for the same load, one of them is in the wrong place.
We know how to contrast and compare pickup vehicles and trucks. How are Spectrum NAS and Spectrum Scale alike yet different?
Spectrum Scale is IBM's mature scale-out and parallel access file system that supports from 1 to 16,384 nodes. It used to be called GPFS (General Parallel File System). A specific CES (Cluster Export Services) cluster of its nodes provides NAS access as a gateway to Spectrum Scale data.
CES supports NFS v4 and Server Message Block (SMB) v3 access. Its SMB support is based on Samba software, and IBM has good links with both the Samba and Microsoft SMB people.
There can be from 1 to 16 SMB nodes. IBM does not say how many NFS nodes there can be in a CES cluster.
Spectrum Scale and Spectrum Scale CES can run on x86, POWER and z Systems (mainframe) hardware, all of which must run the RHEL 7 operating system. All nodes in a CES SMB cluster must be identical.
IBM doesn't say Spectrum Scale CES has no single point of failure but does claim high availability.
All Spectrum Scale CES nodes see the same configuration data. The state of opened files is shared among the CES nodes so that data integrity is maintained.
There is a central CES address pool of IP addresses distributed among the nodes.
We understand that Spectrum Scale CES can use the full features and performance of the GPFS filesystem, but the setup is a little like running an F1 race car: you need a good support team and a full understanding of the machine. CES also offers more than just SMB and NFS access, providing iSCSI, Swift/S3, OpenStack and Unified Objects.
IBM has a graphic positioning Spectrum NAS, Spectrum Scale and its Cloud Object Services:
IBM positioning of Spectrum Scale (CES), Spectrum NAS and Cloud Object Services (COS)
Spectrum NAS was introduced to provide a scale-out NAS cluster supporting from four to tens of nodes, each of which must use identical x86 server hardware. It is based on Compuverde vNAS software. Compuverde talks of scaling out to hundreds of nodes, and IBM doesn't identify an upper scaling limit.
It appears, then, that Spectrum NAS may outscale Spectrum Scale CES as an SMB-accessed NAS system, since CES tops out at 16 SMB nodes.
The nodes are commodity x86 servers, either bare metal or virtual, and Spectrum NAS is a bootable software stack.
There is a single namespace and it's claimed that bottlenecks and single points of failure are avoided.
Both disk and flash node storage are supported with, obviously, flash providing higher performance.
All cluster resources are aggregated, meaning CPUs, storage, cache and bandwidth. Every node has knowledge about which node owns a copy of any given data.
Spectrum NAS is self-healing. There is erasure coding with data striped across nodes and locations, not just disks.
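To make the idea of striping with erasure coding across nodes concrete, here is a minimal sketch using single XOR parity, the simplest possible erasure code. It is an illustration only: IBM and Compuverde do not disclose the coding scheme here, and the function names (encode, rebuild) and the three-data-node layout are our own assumptions.

```python
from __future__ import annotations

from functools import reduce


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def encode(data: bytes, data_nodes: int) -> list[bytes]:
    """Split data into equal-size fragments, one per node, plus one parity fragment.

    Hypothetical layout: fragment i would live on node i, parity on an extra node.
    """
    frag_len = -(-len(data) // data_nodes)  # ceiling division
    padded = data.ljust(frag_len * data_nodes, b"\0")  # zero-pad the last fragment
    fragments = [padded[i * frag_len:(i + 1) * frag_len] for i in range(data_nodes)]
    parity = reduce(xor_bytes, fragments)
    return fragments + [parity]


def rebuild(fragments: list[bytes | None]) -> list[bytes]:
    """Recover at most one missing fragment (marked None) by XOR-ing the survivors."""
    missing = [i for i, frag in enumerate(fragments) if frag is None]
    if len(missing) > 1:
        raise ValueError("single parity can only repair one lost fragment")
    repaired = list(fragments)
    if missing:
        survivors = [frag for frag in fragments if frag is not None]
        repaired[missing[0]] = reduce(xor_bytes, survivors)
    return repaired
```

Losing any one node's fragment leaves enough information on the others to reconstruct it, which is the "self-healing" property in miniature. Production erasure codes tolerate more than one simultaneous loss.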
A virtual IP mechanism is used to ensure that all nodes in a cluster appear available at all times, even when a particular node is taken down for upgrade or has failed.
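A virtual IP scheme of this kind can be sketched roughly as follows. This is a toy model under our own assumptions, not IBM's or Compuverde's implementation: the function names, node names and the least-loaded placement policy are all hypothetical, and the addresses come from the reserved documentation range.

```python
def assign(vips: list[str], nodes: list[str]) -> dict[str, str]:
    """Spread virtual IPs round-robin over the available nodes."""
    if not nodes:
        raise RuntimeError("no nodes available to host virtual IPs")
    return {vip: nodes[i % len(nodes)] for i, vip in enumerate(vips)}


def fail_over(assignment: dict[str, str], failed: str, nodes: list[str]) -> dict[str, str]:
    """Move only the failed node's virtual IPs to the least-loaded survivors."""
    survivors = [n for n in nodes if n != failed]
    if not survivors:
        raise RuntimeError("no surviving nodes to take over virtual IPs")
    # Count how many virtual IPs each survivor already hosts.
    load = {n: 0 for n in survivors}
    for node in assignment.values():
        if node in load:
            load[node] += 1
    updated = dict(assignment)
    for vip, node in assignment.items():
        if node == failed:
            target = min(survivors, key=lambda n: load[n])
            updated[vip] = target
            load[target] += 1
    return updated


nodes = ["node1", "node2", "node3", "node4"]
vips = ["192.0.2.11", "192.0.2.12", "192.0.2.13", "192.0.2.14"]
before = assign(vips, nodes)
after = fail_over(before, "node2", nodes)  # node2 is down for an upgrade
```

Clients keep connecting to the same four addresses throughout. Only the node answering for the failed node's address changes.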
Spectrum NAS supports a wider range of NFS protocol versions than Spectrum Scale CES (v3, v4 and v4.1), as well as SMB v1, v2 and v3.
Microsoft and Compuverde have entered into a licence agreement to enable access to Microsoft's SMB file transport technology for Compuverde's software. The agreement includes access to future generations of SMB.
The Compuverde software also provides iSCSI, OpenStack Swift, and Amazon S3 access support, but we don't know if IBM's Spectrum NAS provides this.
Spectrum NAS has intelligent locking and supports snapshots.
It can be set up and/or upgraded in 30 minutes. We understand this is on a per-node basis. Rolling upgrades can be performed across a Spectrum NAS cluster.
Spectrum NAS and Spectrum Scale CES positioning
Our understanding is that you should use Spectrum NAS and not Spectrum Scale CES if:
- You don't intend to scale past hundreds of nodes
- You don't have parallel file access requirements
- You don't wish to be involved in complex software support and optimisation
Specifically, Spectrum NAS is for home directories, general and virtual machine file serving, and NAS storage for Microsoft applications. Spectrum Scale is for fast-access storage for compute clusters, big data analytics, machine learning and deep learning, and fast backup and restore.
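For the procedurally minded, the three criteria can be written as a simple rule of thumb. This is our reading of the positioning, not an IBM sizing tool: the Workload fields are hypothetical names, the 999-node ceiling is an assumed stand-in for "hundreds of nodes", and we assume all three conditions must hold.

```python
from dataclasses import dataclass

# Assumption: "hundreds of nodes" is taken to mean anything below 1,000.
NODE_CEILING = 999


@dataclass
class Workload:
    expected_nodes: int           # largest cluster size you expect to need
    needs_parallel_access: bool   # parallel file access, HPC-style
    accepts_complex_tuning: bool  # willing to support and optimise complex software


def recommend(workload: Workload) -> str:
    """Return the product the article's positioning points to for this workload."""
    fits_spectrum_nas = (
        workload.expected_nodes <= NODE_CEILING
        and not workload.needs_parallel_access
        and not workload.accepts_complex_tuning
    )
    return "Spectrum NAS" if fits_spectrum_nas else "Spectrum Scale CES"


home_dirs = Workload(expected_nodes=12, needs_parallel_access=False, accepts_complex_tuning=False)
analytics = Workload(expected_nodes=2000, needs_parallel_access=True, accepts_complex_tuning=True)
```

Calling recommend(home_dirs) points to Spectrum NAS, while recommend(analytics) points to Spectrum Scale CES.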
Think of Spectrum NAS as like a neighbourhood coffee shop. Spectrum Scale is then like a 2,000-room hotel with a coffee shop on its premises. You can have coffee in both places, but that's not the main reason for going to the hotel. ®