Coronavirus masks are thwarting facial recognition systems. So, of course, people are building training sets from your lockdown-wear selfies

Plus: A peek inside Nvidia's Ampere architecture, and more


Roundup Here's your summary of recent artificial intelligence bits and bytes, and related hardware and software.

People’s face-mask selfies scraped from the internet: Face masks are a common sight right now, depending on where you live, as people slip them on to curb the spread of the COVID-19 coronavirus. That's a problem for facial-recognition systems, which struggle to identify people with most of their faces obscured.

One potential way to get round this limitation is to train the software on pictures of people wearing face masks, so that the algorithms can identify individuals even when their faces are covered. To do this, you need to collect a large training data set of folks donning lockdown-wear, and feed it into your neural networks to improve them. And where’s the best place to find such images? People’s selfies posted on the internet.

Some datasets have popped up on GitHub either linking to photos on social networks, such as Instagram, or containing scraped images, CNET found. Although the netizens' pictures were publicly shared, it’s often against a platform’s terms of service to download and repurpose the images. In fact, all the links to the Instagram pictures in the COVID19 Mask Image Dataset have been expired by the social network, so they no longer work.

However, the Real-World Masked Face Dataset managed by computer science researchers at China's Wuhan University is still available for download, and contains actual photos of people.

Unfortunately, it’s difficult to prevent your selfies being scraped from public sources. The best way to reduce the chances is to keep your social media accounts private.

A quick peek inside Nvidia's Ampere architecture: Nvidia this month launched the A100, its most powerful graphics processor yet, which uses its new Ampere architecture.

Here’s a diagram of what the A100 looks like inside: we're told it has 54.2 billion 7nm-process transistors on an 826mm² die fabricated by TSMC. There are 64 CUDA execution cores per streaming multiprocessor (SM) on the silicon, totaling 6,912 cores in 108 SMs.


[Diagram of the A100 GPU. Source: Nvidia]

Compared to the previous-generation Volta V100 GPU, which had 80 SMs and 5,120 CUDA cores – that's still 64 cores per SM – the A100 can crunch its way through more data per second, and its larger 40MB L2 cache helps keep the pressure off its RAM bandwidth. Having said that, the graphics processor can shift 1.56TB per second to and from its in-package HBM2 RAM.
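Those core counts are easy to sanity-check. Here's a back-of-the-envelope Python sketch that uses only the SM and core figures quoted above; the function and variable names are ours, purely for illustration.

```python
# Figures quoted in the article; the names are illustrative only.
CORES_PER_SM = 64  # CUDA cores per streaming multiprocessor, same on V100 and A100


def total_cuda_cores(sm_count: int, cores_per_sm: int = CORES_PER_SM) -> int:
    """Total CUDA cores = number of SMs x cores per SM."""
    return sm_count * cores_per_sm


a100_cores = total_cuda_cores(108)  # Ampere A100: 108 SMs
v100_cores = total_cuda_cores(80)   # Volta V100: 80 SMs
core_ratio = a100_cores / v100_cores

print(f"A100: {a100_cores} cores, V100: {v100_cores} cores")
print(f"The A100 has {core_ratio:.2f}x the CUDA cores of the V100")
```

Core count alone doesn't tell you how much faster the chip is in practice, mind: clock speeds, cache, and memory bandwidth all play a part.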

“The Ampere architecture is the first elastic GPU architecture, which enables building versatile and high throughput data centers,” said Krashinsky. By elastic, he is referring to the chip's ability to act as one giant GPU by connecting eight A100s together, and that each A100 can be split into seven virtual instances that run independently. A dozen NVLink interconnects link up the A100s with a total bandwidth of 600GB per second.
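To put rough numbers on that elasticity, here's a small Python sketch built from the figures above. It assumes, and this is our assumption rather than anything Nvidia has said here, that the 600GB per second is split evenly across the dozen links, and that every A100 in an eight-GPU setup is carved into the full seven instances.

```python
# Figures quoted in the article; the names are illustrative only.
NVLINK_LINKS = 12        # "a dozen NVLink interconnects"
TOTAL_NVLINK_GB_S = 600  # total NVLink bandwidth, GB per second
GPUS_LINKED = 8          # A100s connected to act as one giant GPU
INSTANCES_PER_GPU = 7    # virtual instances per A100

# Assumption: bandwidth is shared evenly between the links.
per_link_gb_s = TOTAL_NVLINK_GB_S / NVLINK_LINKS

# Assumption: every GPU is split into the maximum number of instances.
max_instances = GPUS_LINKED * INSTANCES_PER_GPU

print(f"Bandwidth per NVLink: {per_link_gb_s:.0f}GB per second")
print(f"Maximum independent instances across {GPUS_LINKED} GPUs: {max_instances}")
```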

Each SM contains, according to Nvidia, "third-generation Tensor Cores that each perform 256 FP16/FP32 FMA operations per clock," and we're told there are "four Tensor Cores per SM, which together deliver 1024 dense FP16/FP32 FMA operations per clock." This circuitry supports bfloat16 for the first time, too.
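Nvidia's per-clock numbers can be rolled up into a rough chip-wide peak. The Python sketch below multiplies them out, counting each fused multiply-add as two floating-point operations. The 1.41GHz boost clock is not stated in this piece: it is an assumption on our part, so treat the final TFLOPS figure as a ballpark rather than a spec.

```python
# Per-clock figures quoted from Nvidia above; the names are illustrative only.
FMA_PER_TENSOR_CORE_PER_CLOCK = 256
TENSOR_CORES_PER_SM = 4
SM_COUNT = 108  # SMs on the A100, as quoted earlier in the article

# Assumption: boost clock in Hz, not given in the article.
ASSUMED_BOOST_CLOCK_HZ = 1.41e9

fma_per_sm = FMA_PER_TENSOR_CORE_PER_CLOCK * TENSOR_CORES_PER_SM
fma_per_chip = fma_per_sm * SM_COUNT

# One fused multiply-add counts as two floating-point operations.
flops_per_clock = fma_per_chip * 2
peak_tflops = flops_per_clock * ASSUMED_BOOST_CLOCK_HZ / 1e12

print(f"Dense FMA operations per clock per SM: {fma_per_sm}")
print(f"Dense FMA operations per clock per chip: {fma_per_chip}")
print(f"Rough peak at the assumed clock: {peak_tflops:.0f} TFLOPS")
```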

Below is a table of the performance of the chip's various engines, measured in TOPS, or trillions of operations per second. The chip has also improved its ability to handle sparse arrays.

If you’re looking for the ultimate deep dive into the Ampere architecture, here’s the whitepaper [PDF] to get stuck into. Also, Nvidia announced its fiscal Q1 2021 financial figures, as we reported here.

Google says it won’t build AI tools that will unlock fossil fuels: The Chocolate Factory declared it would not develop custom machine-learning algorithms that will help the oil and gas industry extract fossil fuels.

That’s what a spokesperson told OneZero after a Greenpeace report revealed contracts between 14 tech companies and oil giants such as Exxon, Chevron, and Total. Going green, or rather appearing to go green, is a trendy way for mega-corps to demonstrate they care about the planet and its populations. Microsoft and Amazon have pledged millions of dollars to climate research. Now, Google has promised it won’t develop specialized tools for the purpose of extracting oil and gas.

That may seem all well and good but, as ex-Googler Meredith Whittaker pointed out, the word “custom” is tricky. It doesn’t mean that some of Google's existing algorithms can’t be used by the oil and gas industry.

AWS online machine learning scholarship: Taking classes online is all the rage now as everyone stays home during the coronavirus pandemic. Amazon has teamed up with Udacity, an online learning platform, to create the AWS Machine Learning Scholarship Program, which is open for enrollment now.

The online course lasts two and a half months, and is aimed at programmers who want to learn how to be an AI engineer. It’ll teach you everything from how to build machine-learning algorithms to how to deploy them in AWS services, such as Amazon SageMaker, we're told.

The program runs from May 19 through to July 31. You can enroll at any point, and it’s all free. The top 325 students will get the chance to take the Udacity Machine Learning Engineer Nanodegree program at no extra cost to learn more advanced machine learning techniques.

If that sounds like something you’d enjoy, sign up here. ®

