Calxeda is also cooking in that homegrown interconnect, which has yet to be given a name outside of the company. It is not clear how the Calxeda interconnect will hook into the Cortex-A9 chip, but that ARM design allows for two 64-bit Advanced Microcontroller Bus Architecture (AMBA) Advanced Extensible Interface (AXI) ports, with a combined 12 GB/sec of bandwidth into the system interconnect on the chip. It may be that Calxeda is implementing a different protocol entirely – perhaps InfiniBand or 10 Gigabit Ethernet – right on the die and hooking it to the AXI ports. This would be the simplest and cheapest thing to do.
Because the Cortex-A9 is only a 32-bit processor, the Calxeda server nodes will top out at 4 GB of main memory per node. That is the upper limit of addressability for a 32-bit processor, of course, and in this case, it will be a single 4 GB stick of low-power DDR3 memory in a single slot.
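For readers who want the arithmetic behind that 4 GB ceiling, here is a minimal sketch. It assumes a flat, byte-addressable 32-bit physical address space with no address-extension tricks, and the variable names are purely illustrative.

```python
# Sketch: how much memory a flat 32-bit address space can reach.
# Assumption: one address per byte, no address extensions of any kind.
ADDRESS_BITS = 32

addressable_bytes = 2 ** ADDRESS_BITS          # number of distinct byte addresses
addressable_gib = addressable_bytes / 2 ** 30  # convert bytes to binary gigabytes

print(f"{ADDRESS_BITS}-bit addressing reaches {addressable_bytes:,} bytes "
      f"({addressable_gib:.0f} GB in binary units)")
```

In practice some of that address space is typically reserved for devices, so the usable figure on a real board may come in a little lower.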
Freund says that a quad-core A9-derived processor, plus its memory controller, the DDR3 memory module, and the on-chip fabric interconnect will burn only 5 watts. Clock speeds were not divulged, but they will probably be somewhere between 1 GHz and 2 GHz. That is less juice than a fat DDR3 memory stick uses, never mind an Intel or AMD x64 chip.
"This gives us extremely high levels of density," says Freund. And, the fabric interconnect will allow for "multiple thousands of cores" to be lashed together and controlled as a unit. (But not in a cache-coherent, shared memory manner. Don't get the wrong idea.)
The Cortex-A9 does not have any circuits to do virtualization, but Freund says that on the workloads that Calxeda expects customers to use the chip for, they won't need hypervisors to carve up the servers. They will already have parallelized workloads that span thousands of nodes and run at very high utilization rates. On an x64 server, you use a hypervisor to plunk multiple server images on one set of chips, each running a workload that might consume only 5, 10, 15, or 20 per cent of the raw CPU capacity by itself, thereby driving up utilization of the overall system.
That said, hypervisors and their control freak add-ons are also useful for managing workloads and spreading running workloads around a cluster of machines. Freund says that Calxeda is participating in the OpenStack cloud fabric effort to see how to adapt these tools to manage bare-metal images instead of virtual images on machines using its ARM variants. The Linux community is also working on software container technology for ARM chips, according to Freund, which could be useful for some workloads.
Calxeda is not going to make and sell servers, but rather make chips and reference machines that it hopes other server makers will pick up and sell in their product lines. The company hopes to start sampling its first ARM chips and reference servers later this year. The first reference machine has 120 server nodes in a 2U rack-mounted format, and the fabric linking the nodes together internally can be extended to link multiple enclosures.
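To put the density claim in numbers, here is a back-of-envelope sketch that combines the 120-node, 2U reference machine with the four cores and 5 watts per node quoted above. The 42U rack height is an assumption for illustration, not something Calxeda has stated, and the wattage covers only the processor, memory, and fabric, not power supplies, fans, or disks.

```python
# Back-of-envelope density sketch using figures quoted in the article.
NODES_PER_ENCLOSURE = 120   # first reference machine
ENCLOSURE_HEIGHT_U = 2      # 2U rack-mounted format
CORES_PER_NODE = 4          # quad-core A9-derived processor
WATTS_PER_NODE = 5          # processor + memory + fabric, per Freund

RACK_HEIGHT_U = 42          # ASSUMPTION: a standard full-height rack, fully populated

# Totals for one 2U enclosure.
cores_per_enclosure = NODES_PER_ENCLOSURE * CORES_PER_NODE
watts_per_enclosure = NODES_PER_ENCLOSURE * WATTS_PER_NODE

# Totals for a hypothetical rack filled with nothing but these enclosures.
enclosures_per_rack = RACK_HEIGHT_U // ENCLOSURE_HEIGHT_U
nodes_per_rack = enclosures_per_rack * NODES_PER_ENCLOSURE
cores_per_rack = enclosures_per_rack * cores_per_enclosure
watts_per_rack = enclosures_per_rack * watts_per_enclosure

print(f"Per 2U enclosure: {cores_per_enclosure} cores, {watts_per_enclosure} W")
print(f"Per {RACK_HEIGHT_U}U rack: {nodes_per_rack} nodes, "
      f"{cores_per_rack} cores, {watts_per_rack / 1000:.1f} kW")
```

On those assumptions, a single enclosure carries a few hundred cores, and it takes only a handful of enclosures to reach the "multiple thousands of cores" Freund mentions.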
The initial workloads that Calxeda is targeting include internet-scale web serving, of course, as well as streaming content delivery (so long as it doesn't need compute-intensive DRM), small web application hosting, storage controllers, and big data analytics.
"NoSQL and MapReduce are a beautiful fit for these servers because of the ratio of CPU, memory, and disk and the performance per watt," says Freund. ®