AI bots suck at marking written essays, not too shabby at old Atari games, and more...

The week in AI


Roundup Hello, here's a quick roundup of some announcements from the world of AI this week.

OpenAI researchers reach the highest score yet on the computer game Montezuma's Revenge through reinforcement learning, DeepMind teaches its bots to play Capture the Flag in Quake III Arena, and the US Department of Education is exploring the idea of using AI to mark essays. It's all here in the weekly roundup.

Montezuma’s Revenge is back: OpenAI researchers have trained a bot from a single video demonstration to reach the highest score on Montezuma’s Revenge yet.

The classic Atari game is challenging for reinforcement learning algorithms because its rewards are sparse. Agents have to explore and figure out the best combination of moves to collect points over a much longer stretch of play than in faster-paced Atari games like Breakout.

So it’s very difficult to teach bots to play the game purely through trial and error, and researchers have to train them on human demonstrations instead. OpenAI used its Proximal Policy Optimization algorithm and coaxed its agent to play from states already seen in the training video.

“Our approach works by letting each RL episode start from a state in a previously recorded demonstration. Early on in training, the agent begins every episode near the end of the demonstration. Once the agent is able to beat or at least tie the score of the demonstrator on the remaining part of the game in at least 20 per cent of the rollouts, we slowly move the starting point back in time,” it explained in a blog post.

“We keep doing this until the agent is playing from the start of the game, without using the demo at all, at which point we have an RL-trained agent beating or tying the human expert on the entire game.”
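The schedule OpenAI describes can be sketched in a few lines. This is only a rough illustration of the "move the starting point back" idea, not OpenAI's code: the function name `backward_curriculum`, the `rollout` callback, and the `demo_returns` list are all hypothetical, and the real system trains a PPO agent and restores emulator states rather than calling a stub.

```python
from typing import Callable, List


def backward_curriculum(
    demo_returns: List[float],
    rollout: Callable[[int], float],
    rollouts_per_check: int = 20,
    success_threshold: float = 0.2,  # the "at least 20 per cent" rule from the quote
    step_back: int = 1,
    max_iterations: int = 10_000,
) -> int:
    """Move the episode start point backwards through a demonstration.

    demo_returns[t] is the score the demonstrator collected from step t to
    the end of the demo. rollout(t) is a hypothetical callback that runs (and
    trains) the agent from demo state t and returns the score it achieved.
    Returns the final start index; 0 means the agent plays from the start.
    """
    start = len(demo_returns) - 1  # begin every episode near the end of the demo
    for _ in range(max_iterations):
        scores = [rollout(start) for _ in range(rollouts_per_check)]
        # A rollout counts as a success if it beats or ties the demonstrator
        # on the remaining part of the game.
        successes = sum(score >= demo_returns[start] for score in scores)
        if successes / rollouts_per_check >= success_threshold:
            if start == 0:
                break  # the agent now matches the demo from the very beginning
            start = max(0, start - step_back)  # slowly move the start back in time
    return start


if __name__ == "__main__":
    # Toy demonstration: score remaining from each of four demo states.
    demo = [10.0, 8.0, 5.0, 1.0]
    # A stand-in "agent" that always ties the demonstrator.
    print(backward_curriculum(demo, lambda t: demo[t]))  # prints 0
```

With a stub agent that never matches the demonstrator, the start index stays at the end of the demo until `max_iterations` runs out, which is the sketch's stand-in for "keep training at this point".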

The agent reached a score of 74,500. DeepMind also had a crack at Montezuma’s Revenge recently, using YouTube videos for training, and reached 41,098.

Now, here’s Capture the Flag: Speaking of DeepMind, researchers across the pond have taught a team of bots to play the old Quake III Arena game in Capture the Flag mode.

In the game, players compete in two teams. The goal is to take the other team’s flag whilst protecting your own. Players can chase after opponents to tag them and send them back to their spawning point. The team that has captured the most flags within five minutes wins.

DeepMind hosted informal matches with 40 human players, who were split into teams containing bots as both teammates and enemies. The researchers found that the bots “exceed the win-rate of the human players,” and were seen as being more collaborative than human players.

Instead of focusing on a single bot, the researchers trained a population of agents to play with each other. Each agent learns its own reward signal and can generate its own goal, whether that be capturing a flag or protecting its own.

Dubbed the For The Win agent, it reaches high performance levels, beating human players and other reinforcement learning methods after playing more than 150,000 training games. Agents have the advantage of lightning-speed reactions, so they were faster at tagging opponents, but they also learnt certain strategies like following teammates around the map or camping near the opponent’s territory.


Intel and Baidu working together: Intel announced a range of collaborations with Baidu during the Baidu Create conference in Beijing this week.

They include:

  • Xeye - a camera aimed at retailers who want to analyse objects and detect people using facial recognition. It uses Intel Movidius vision processing unit chips.
  • Baidu Cloud and FPGAs - Baidu Cloud users can now access Intel’s FPGAs to handle AI workloads.
  • PaddlePaddle and Xeons - Baidu’s AI framework PaddlePaddle now supports Intel’s Xeon Scalable processors.

“From enabling in-device intelligence, to providing data center scale on Intel Xeon Scalable processors, to accelerating workloads with Intel FPGAs, to making it simpler for PaddlePaddle developers to code across platforms, Baidu is taking advantage of Intel’s products and expertise to bring its latest AI advancements to life,” said Gadi Singer, vice president and architecture general manager at Intel’s Artificial Intelligence Products Group.

Robo-graders get an F: The thought of AI marking exam essays should ring alarm bells, but apparently the US Department of Education is thinking about doing exactly that.

Machines have been offered as a solution to sniff out fake news or moderate hateful internet comments, but it never works because they are bloody awful at actually understanding content. Just look at the so-called “smart” digital assistants like Siri, Google Home, or Amazon’s Alexa.

For some reason this doesn’t faze people at the Department of Education, according to a recent NPR clip.

“Department Of Education Deputy Commissioner Jeff Wulfson cited ‘huge advances in artificial intelligence in the last few years’ and cracked, ‘I asked Alexa whether she thought we'd ever be able to use computers to reliably score tests, and she said absolutely.’” Oh dear.

Luckily, teachers have been pushing back, arguing that machine marking will be rigid and will ignore creativity and expression.

Here’s a short paragraph that gets top marks from an algorithm.

"History by mimic has not, and presumably never will be precipitously but blithely ensconced. Society will always encompass imaginativeness; many of scrutinizations but a few for an amanuensis. The perjured imaginativeness lies in the area of theory of knowledge but also the field of literature. Instead of enthralling the analysis, grounds constitutes both a disparaging quip and a diligent explanation."

Mind you, that text has been generated by an algorithm too. Known as Babel (Basic Automatic B.S. Essay Language), it creates sentences peppered with very impressive-sounding words, and even includes a comma or two now and again and full stops at the end of sentences. But it’s complete gobbledygook and doesn’t mean a thing. By those standards El Reg articles would probably score a big fat 0.

Things might not be so bad if a robo-grader is paired with a fact checker and a human reader. But there are still issues around what signals should flag an essay for a human reader. ®


Biting the hand that feeds IT © 1998–2022