Don't scrape the faces of our citizens for recognition, Canada tells Clearview AI – delete those images

Plus: Check if your Flickr photos are in facial recognition engines, and the list of NSFW words for AI


Canada’s privacy watchdog has found Clearview AI in “clear violation” of the country’s privacy laws, and has told the facial-recognition startup to stop scraping images of Canadians and delete all existing photos it has on those citizens.

The Office of the Privacy Commissioner of Canada launched an official investigation into the upstart’s practices, and as a result Clearview stopped selling its software to Canadian police.

“Clearview's massive collection of millions of images without the consent or knowledge of individuals for the purpose of marketing facial recognition services does not comply with Quebec's privacy or biometric legislation,” said Diane Poitras, President of the Quebec Commission on Access to Information, a government organization involved in the investigation.

The startup was told to stop taking people’s photos to train its facial-recognition software, delete all the ones it has collected from people in Canada, and not sell its services to any Canadian customers. New York-based Clearview, however, argued that it does not have a “real and substantial connection” to the country so shouldn’t need to abide by its laws, and that consent was not needed to scrape the photos since they’re all publicly available anyway.

Have your Flickr photos been used to train a facial recognition model?

AI researchers have built an online tool that allows people to check if their selfies have been used to secretly train facial-recognition software.

Exposing.ai – built by developer and artist Adam Harvey, and Liz O’Sullivan, technology director at privacy rights group the Surveillance Technology Oversight Project – looked through AI training datasets built from scraping creative-commons-licensed photos on photo-sharing site Flickr. They tracked down the URL for each photo and put it into a database, and users can look through the data by searching for a specific URL, image hashtag, or Flickr username.

If there’s a hit, then the image is present in at least one of the six datasets used to teach machines how to identify faces. “People need to realize that some of their most intimate moments have been weaponized,” O’Sullivan told the NYT. “The potential for harm seemed too great.”

You can use the tool here.

The List of Dirty, Naughty, Obscene, and Otherwise Bad Words AI researchers use to filter data

The best way to prevent machine-learning models from generating any text or images that are too racy and lewd is to not train the software on data that is, well, too racy or lewd.

One way that researchers do this is by automatically screening out any data that contains or is related to x-rated subject areas that they want their models to avoid. Enter the List of Dirty, Naughty, Obscene, and Otherwise Bad Words, known as LDNOOBW, a handy checklist of indecent words that is now shared on GitHub.

Created first by folks over at Shutterstock, the stock image biz, the list contains hundreds of words in numerous languages so far, and is now employed by other tech companies like Slack and Google, Wired reported.

Colossal Clean Crawled Corpus, the popular text dataset used to train large language models, uses LDNOOBW to filter out webpages containing those words. The idea is that words like ‘busty’ or ‘kinky’ are more likely to be associated with pornographic sites and are blocked from the training data. But some critics believe censoring some words means that these algorithms will have no knowledge of some human sexualities that are traditionally underrepresented.
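To make the filtering idea concrete, here is a minimal Python sketch of a word-list filter of the kind described above. It is an illustration under stated assumptions, not the actual pipeline used for the Colossal Clean Crawled Corpus: the two-word `BLOCKLIST`, the sample pages, and the function names are all hypothetical, and the real LDNOOBW list is far longer.

```python
import re

# Hypothetical stand-in for the blocklist: only the two example words
# quoted in the article. The real LDNOOBW list is much longer.
BLOCKLIST = {"busty", "kinky"}


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def contains_blocked_word(text, blocklist=BLOCKLIST):
    """Return True if any token in the text appears on the blocklist."""
    return any(token in blocklist for token in tokenize(text))


def filter_pages(pages, blocklist=BLOCKLIST):
    """Keep only the pages that contain no blocklisted word."""
    return [page for page in pages if not contains_blocked_word(page, blocklist)]


# Made-up sample pages for illustration.
pages = [
    "A guide to tuning database parameters.",
    "Kinky Boots: the musical, reviewed.",
    "Notes on training large language models.",
]

kept = filter_pages(pages)
for page in kept:
    print(page)
```

Note that the second sample page, a theater review, is dropped along with anything genuinely pornographic. A whole-page filter keyed on single words cannot tell the difference, which is the sort of over-blocking the critics are worried about.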

Do you need an AI algo to help you code at work?

Kite, a startup focused on building autocomplete tools for programmers using machine learning, now has support specifically for developers on the job. Companies can now pay for an enterprise license to use the software at work, in other words.

It costs $40 per user per month, $10 more than its license for individuals. Students are allowed to use it for free.

The enterprise version, known as Kite Team Server, is more powerful and runs on GPU servers rather than CPU ones. The software can also be trained on a company’s proprietary codebase to come up with suggestions based on custom code.

CEO Adam Smith told The Register that people’s code is always kept private.

“Kite Team Server custom-trains ML models on a GPU behind the company's firewall. Kite Team Server ensures code stays private and secure by keeping it behind the firewall.” None of the inputs and outputs generated by its tools are stored on its servers or shared.

You can read more about it here. ®

