Hot Chips Nvidia is channelling Thunderbirds legend Parker for its latest system-on-chip for self-driving cars.
Two Parker processors are used in the PX 2 box we saw at the start of the year. This hardware adds a super-cruise-control to vehicles by hoovering up video feeds from onboard cameras and other sensor data, feeding it through an artificial intelligence model, and constantly making decisions on speed, lane positioning and so on.
This allows the car to drive itself, provided the model is trained appropriately for its surroundings. Today's deep-learning systems require a shedload of data to be able to spot patterns, recognize objects and comprehend situations from their surroundings. Producing a box like the PX 2 is only half or a third of the job. Giving it a well-trained model that understands construction works, signs, intersections and complex street junctions, not just highways, is crucial if you want truly autonomous vehicles.
We went into detail about the Pascal architecture here. Now let's look at the Tegra system-on-chip codenamed Parker, which shares the same name as International Rescue's loyal chauffeur and butler.
The 16nm FinFET SoC has 256 Pascal CUDA GPU cores and six 64-bit ARMv8-compatible general-purpose cores: two of Nvidia's homegrown Denver 2 cores and four stock Cortex-A57 cores. All six are cache coherent with one another, with a 2MB L2 cache shared by the Denver cores and a separate 2MB L2 cache shared by the A57s. The chip is fabricated by TSMC, and can perform 1.5TFLOPS on 16-bit floating-point values, according to Nvidia.
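As a back-of-the-envelope check on that 1.5TFLOPS figure, the sketch below works backwards to the GPU clock speed it would imply. It rests on two assumptions of ours, not numbers Nvidia has stated here: that each CUDA core retires one fused multiply-add per cycle (counted as two floating-point operations), and that 16-bit values are processed two at a time per core. The function name is made up for illustration.

```python
def implied_clock_ghz(tflops, cores, flops_per_fma=2, fp16_per_core=2):
    """Clock (in GHz) needed to reach `tflops` under the stated assumptions."""
    # Assumed peak throughput per clock cycle across the whole GPU.
    flops_per_cycle = cores * flops_per_fma * fp16_per_core
    return tflops * 1e12 / flops_per_cycle / 1e9


clock = implied_clock_ghz(1.5, 256)
print(f"Implied GPU clock: {clock:.2f} GHz")
```

Under those assumptions the quoted throughput works out to a clock of roughly 1.5GHz, which is at least plausible for a 16nm part.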
It also includes cryptography engines, a 60FPS 4K video decoder and encoder, audio output, a 2D graphics renderer, 128-bit 50GB/s low-power DDR4 interfaces with ECC, and an image processor. You can basically hook it up to 12 road-watching cameras that are spread around the vehicle. There are also interfaces for flash memory card storage, SATA, QSPI and the standard CAN bus. The two CAN interfaces conform to ISO 11898-1.
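The 50GB/s memory figure can be sanity-checked from the bus width. This is a rough sketch, assuming peak bandwidth is simply the bus width in bytes multiplied by the transfer rate. The 3,200 mega-transfers-per-second rate is our assumption (a common LPDDR4 speed grade), not something stated in Nvidia's material.

```python
def peak_bandwidth_gb_s(bus_bits, mega_transfers_per_s):
    """Theoretical peak bandwidth in GB/s for a given bus width and rate."""
    bytes_per_transfer = bus_bits // 8
    return bytes_per_transfer * mega_transfers_per_s * 1e6 / 1e9


# Assumed speed grade: 3,200 MT/s on the 128-bit interface.
bw = peak_bandwidth_gb_s(128, 3200)
print(f"Peak bandwidth: {bw:.1f} GB/s")
```

A 128-bit bus moves 16 bytes per transfer, so at the assumed rate the peak lands close to the quoted 50GB/s.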
As we'll see later, there is a heavy focus on virtualization: Parker uses ARM's hardware virtualization to put the various bits of software controlling the car into secure sandboxes – in theory.
There's also a safety management engine, which is a fancy watchdog. It's a separate dual processor that runs in lockstep to detect internal hardware errors. It runs a realtime operating system and monitors the onboard computer for faults, reporting them as they happen and running recovery code where possible. It picks up on warnings from the error-correcting RAM, and blacklists unreliable areas of DRAM. On-die memories also have ECC and parity protection.
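The lockstep idea can be illustrated in software, although the real thing is done in hardware. The toy sketch below runs the same computation twice, as if on two redundant cores, and raises a fault when the results disagree. Every name in it, including the fault-injection hook, is hypothetical and has nothing to do with Nvidia's actual implementation.

```python
from typing import Callable, Optional


class LockstepFault(Exception):
    """Raised when the two redundant executions disagree."""


def run_lockstep(task: Callable[[int], int], value: int,
                 corrupt_b: Optional[Callable[[int], int]] = None) -> int:
    """Run `task` twice and compare; optionally corrupt the second result."""
    result_a = task(value)
    result_b = task(value)
    if corrupt_b is not None:
        result_b = corrupt_b(result_b)  # simulate a hardware error on "core B"
    if result_a != result_b:
        raise LockstepFault(f"mismatch: {result_a} != {result_b}")
    return result_a


def checksum(x: int) -> int:
    """Stand-in workload for whatever the monitored code computes."""
    return (x * 31 + 7) & 0xFFFF


print(run_lockstep(checksum, 1234))  # both copies agree
try:
    run_lockstep(checksum, 1234, corrupt_b=lambda r: r ^ 0x1)  # flip one bit
except LockstepFault as err:
    print("fault detected:", err)
```

A two-way comparison like this can only reveal that something went wrong, not which copy is at fault, which is why detection is paired with separate recovery code.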
We're told Parker is ISO 26262 compliant.
The diagram below shows how the CPUs fit together, with 128KB instruction and 64KB data caches per Denver core and 48KB instruction and 32KB data caches per A57. The interconnect allows the Denvers and Cortexes to work together heterogeneously and the operating system to migrate threads and processes to the different cores depending on the workload levels.
The hardware virtualization supports up to eight virtual machine environments. Each virtual machine controls its own display pipeline. The CUDA cores, networking hardware, and audio and DMA circuitry are also virtualized, to prevent an unprivileged VM from screwing over the system. Of course, all hypervisors have bugs, allowing code to escape their confines – let's hope this doesn't happen here.
Finally, here's how it all fits together: your blueprint for designing and hacking a self-driving Nvidia-powered car. Parker drives dashboard displays and infotainment panels via two discrete PCIe-connected GPUs. The SoC's CAN lines are wired straight into the vehicle's communications bus so it can control the engine and other systems. You can also connect to it via USB 3.0. You can even reach the chip via 10Gbps Ethernet.
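For a sense of what travels over those CAN lines, here is a minimal sketch that packs and unpacks a classic CAN frame using the 16-byte layout Linux's SocketCAN uses: a 32-bit identifier, a one-byte data length, three padding bytes, and up to eight data bytes. The message ID and payload are invented for illustration. Real vehicle message IDs are manufacturer-specific, and nothing here is taken from Nvidia's software.

```python
import struct

# Same layout as the CAN example in Python's socket module documentation.
CAN_FRAME_FMT = "=IB3x8s"
CAN_FRAME_SIZE = struct.calcsize(CAN_FRAME_FMT)


def build_can_frame(can_id: int, data: bytes) -> bytes:
    """Pack an identifier and up to 8 payload bytes into a raw frame."""
    if len(data) > 8:
        raise ValueError("classic CAN carries at most 8 data bytes")
    return struct.pack(CAN_FRAME_FMT, can_id, len(data), data.ljust(8, b"\x00"))


def parse_can_frame(frame: bytes):
    """Unpack a raw frame into (identifier, payload without padding)."""
    can_id, length, data = struct.unpack(CAN_FRAME_FMT, frame)
    return can_id, data[:length]


frame = build_can_frame(0x123, b"\x01\x02\x03")  # hypothetical ID and payload
print(CAN_FRAME_SIZE, parse_can_frame(frame))
```

On a Linux machine with a CAN interface, bytes like these could be written to a raw `socket.AF_CAN` socket. Note that classic CAN has no built-in sender authentication, which is part of why direct access to the bus matters for security.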
Earlier, Nvidia said Parker includes a gigabit Ethernet controller to hoof audio and video around the vehicle, but in the diagram here the Ethernet is provided by a PCIe-connected Intel chip and video is piped via PCIe.
This slide, like the other Nvidia-supplied slides here, was obtained at the Hot Chips conference this week in California.
A PX 2 box with two Parker SoCs and two discrete GPUs delivers 8TFLOPS of performance, or 24 trillion deep-learning operations per second, we're told. Some 80 computer-science research centers, automakers and other tech organizations are right now busy playing with the gear. Volvo plans to roll out the PX 2 in its XC90 SUVs next year. ®