Would you like a side of data with your chips? Silicon-slingers start bundling info with their hardware
Intel, Nvidia, and others have figured out you need help getting started with AI
Some chip makers are starting to supply data as a value-added service.
Silicon-slingers, especially those with products for AI, know that their products are worthless without information from which knowledge can be extracted; it's like selling an empty brain.
As a result, they’re building or acquiring data sets to make their products more attractive.
Lattice Semiconductor last month made an under-the-radar acquisition of an outfit named Mirametrix, which has AI assets that include eye-tracking and gaze detection data.
Ownership of Mirametrix's data sets could open more doors for Lattice -- which primarily sells FPGAs -- into fast-growing industries like automotive and wellness. The company isn't selling the data sets themselves, but rather algorithms that car companies could use to build systems that monitor drivers and alert them if they doze off or stop paying attention to the road.
"With that data, we can actually provide a better experience and do things that other people will find very difficult to do," Esam Elashmawi, Lattice's chief strategy and marketing officer, told The Register.
Elashmawi added that the data isn't tied to Lattice’s FPGAs, but can be used with chips made by other companies. Mirametrix technology has already been used by companies like Lenovo in laptops.
"We are looking for ways to... and differentiate for a better application experience for our customers. Sometimes it is a data set," Elashmawi said.
Intel's Mobileye unit, which will soon go public, relies on proprietary data sets describing road conditions to sell more cameras and sensors to car makers. In a pre-listing filing, the company asserted "we believe that no other company in the world has road experience datasets as deep and as wide as ours."
Mobileye has collected information from its existing sensors and cameras in various models of cars, which have driven millions of miles in 40 countries. The crowdsourced information consists of select data points gleaned from images, which are sent back to Mobileye at a low bandwidth of 10KB per car per kilometer. The harvested information helps car makers customise autonomous vehicles to unique road conditions found around the world.
Nvidia is creating data sets by simulating real-world situations in the “metaverse”, based on camera views and lighting, to train robots to navigate. The company has also created data sets of realistic scenes through simulated cameras for autonomous driving. A gold star if you guessed the simulations run best on Nvidia GPUs.
Good data sets are rare and valuable and can set industry agendas, Andrew Feldman, CEO of AI acceleration vendor Cerebras Systems, told The Register. Cerebras last month raised $250 million in its latest round of funding on a $4bn valuation.
"The collection, management, and mining of data is a source of competitive differentiation. Data sets are expensive to gather, clean and curate," Feldman said.
"If everyone in an ecosystem is using one dataset, and that data set provides an advantage to some and disadvantage to others, these advantages can get embedded in the foundation of the ecosystem," Feldman said.
Cerebras' WSE-2 AI megachip, which is the size of a wafer and has 850,000 cores, is agnostic by design and can run a variety of AI applications simultaneously.
Data is becoming the commodity that helps chip makers lock customers to their hardware. But AI isn't one-size-fits-all, and customization can extract the best performance and power efficiency from AI models for tasks like optimising electric cars running on batteries.
"Algorithms are getting more and more complex and taking up more memory and storage space—hard enough to deal with—but then they need large amounts of data to be trained for the tasks they will perform," Shane Rau, research vice president at IDC, told The Register.
Beyond proprietary data sets, chip makers are contributing to open-source data sets. Intel and Nvidia have contributed to People’s Speech, an open-source English speech recognition data set, which is available for free under a Creative Commons license. ®