Relax, Amazon workers – OpenAI-trained robo hand isn't much use (well, not right now)

Turns out replacing humans isn't that easy after all


Vid Human hands are surprisingly dexterous: they can knit clothes, stuff delivery packages with things, play the piano, and so on, albeit with practice.

Yet if you're worried machines are going to take these pleasures away from us, rest assured that we mortals can, for now, pick up these skills faster than robots can, judging from the following findings.

Researchers at OpenAI trained, using about a hundred years of simulated experience, a robotic system called Dactyl to rotate and orientate a cube. Dactyl exists not just in its virtual world, though. It can also control a Shadow Dexterous Hand: a metal meathook complete with five fingers, force sensors, and 24 degrees of freedom – pretty close to a human’s 27 degrees of freedom.

Here’s a video of Dactyl in action, virtually and physically. The cube it's told to fondle features a specific letter and color on each of its six faces, and it has to figure out how to manipulate the object so that it finds the requested symbol.

Youtube Video

Over time, it discovered and mastered techniques often used by humans, such as gripping the cube between the thumb and little finger and spinning the cube around with its other fingertips.

The perils of machine learning

What’s most interesting, perhaps, is the way Dactyl was taught. Despite being trained in a simulated world, the software was able to directly transfer what it learned to a real humanoid-like mechanical hand. This is not an easy process.

The trick was to use a method dubbed domain randomization. It’s something other researchers have been exploring for a while to close the simulation-to-reality gap in robotics.

And while OpenAI managed to close that distance, there remained a noticeable gap. The software performed better when controlling a simulated hand, with a median of 50 successes compared to 13 when hooked up to real hardware, according to results published in the team's paper. And by success, they mean "the number of consecutive successful rotations until the object is either dropped, a goal has not been achieved within 80 seconds, or until 50 rotations are achieved."
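To make that scoring rule concrete, here is a minimal Python sketch of how one trial might be tallied and how a median across trials falls out. The 80-second timeout and 50-rotation cap come from the quoted definition; the log format (a list of seconds taken per goal, plus an optional drop index) and the sample trials are invented for illustration and are not OpenAI's data or code.

```python
from statistics import median

MAX_ROTATIONS = 50       # a trial is capped at 50 rotations (per the quoted definition)
TIMEOUT_SECONDS = 80.0   # each goal must be reached within 80 seconds


def consecutive_successes(goal_times, dropped_at=None):
    """Count successful rotations in one trial.

    goal_times: seconds taken to reach each goal, in order (hypothetical log format).
    dropped_at: index of the goal during which the object was dropped, or None.
    """
    count = 0
    for i, seconds in enumerate(goal_times):
        if dropped_at is not None and i == dropped_at:
            break  # object dropped, so the trial ends
        if seconds > TIMEOUT_SECONDS:
            break  # goal not reached in time, so the trial ends
        count += 1
        if count == MAX_ROTATIONS:
            break  # cap reached
    return count


# Three made-up trials: one hits the cap, one times out, one drops the object.
trials = [
    consecutive_successes([5.0] * 60),
    consecutive_successes([4.0, 6.0, 90.0, 3.0]),
    consecutive_successes([4.0] * 10, dropped_at=7),
]
print(trials, median(trials))
```

Because every trial is capped at 50, a median of 50 in simulation means at least half of the simulated trials ran all the way to the cap.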

"Even though randomizations and calibration narrow the reality gap, it still exists and performance on the real system is worse than in simulation," the paper stated.

In other words, in the simulation it did fine – but with the effects of gravity, imperfections in the mechanisms, and other real world effects, the software turned into a butterfingers. Indeed, during testing, the robotic hand broke down dozens of times.

Variables

The machine-learning software was trained in a range of simulated environments where some of the variables such as surface friction, the size of the object, lighting conditions, hand poses, textures, and even the strength of gravity were changed randomly. The idea was to at least attempt to prepare the model for the unpredictable universe in which we live.
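As a rough illustration of the idea, the Python sketch below draws a fresh set of physical parameters for each training episode. The parameter names echo the variables listed above, but the ranges, the uniform sampling, and the SimParams structure are assumptions made for this example, not values or code from OpenAI.

```python
import random
from dataclasses import dataclass


@dataclass
class SimParams:
    surface_friction: float   # multiplier on a nominal friction coefficient
    object_size_scale: float  # multiplier on the nominal cube size
    light_intensity: float    # multiplier on nominal scene lighting
    gravity: float            # metres per second squared


# Illustrative ranges only; these are assumptions, not figures from the paper.
RANGES = {
    "surface_friction": (0.7, 1.3),
    "object_size_scale": (0.95, 1.05),
    "light_intensity": (0.5, 1.5),
    "gravity": (9.3, 10.3),
}


def sample_params(rng):
    """Draw one randomized environment configuration."""
    return SimParams(**{name: rng.uniform(lo, hi) for name, (lo, hi) in RANGES.items()})


rng = random.Random(0)  # seeded so runs are repeatable
for episode in range(3):
    # A real training loop would rebuild the simulator with these values
    # before collecting experience for the episode.
    print(episode, sample_params(rng))
```

The point of the design is that the policy never sees the same physics twice, so it cannot lean on the quirks of any single simulated world.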

“Randomized values are a natural way to represent the uncertainties that we have about the physical system and also prevent overfitting to a single simulated environment,” the OpenAI team explained in a blog post on Monday this week.

"If a policy can accomplish the task across all of the simulated environments, it will more likely be able to accomplish it in the real world."

Dactyl racked up so many hours of experience in such a short time by using Rapid, a system that uses 384 “worker machines”, each with 16 CPU cores, running a Proximal Policy Optimization (PPO) algorithm. Each worker machine taught itself using a simulation of the Shadow Dexterous Hand in various randomized scenarios.
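For readers unfamiliar with PPO, its core is a "clipped" objective that stops any single update from pushing the policy too far from the one that gathered the experience. The Python sketch below shows that per-sample term as it is generally described for PPO; the function name and the epsilon of 0.2 are choices made for this example, and nothing here is taken from Rapid itself.

```python
def clipped_surrogate(ratio, advantage, epsilon=0.2):
    """Per-sample PPO clipped objective term.

    ratio: probability of the action under the new policy divided by its
           probability under the policy that collected the data.
    advantage: estimate of how much better the action was than expected.
    epsilon: clip range (0.2 is an assumed, commonly used value).
    """
    clipped_ratio = max(1.0 - epsilon, min(ratio, 1.0 + epsilon))
    # Taking the minimum gives a pessimistic bound: no extra credit for
    # moving the ratio beyond the clip range in the favourable direction.
    return min(ratio * advantage, clipped_ratio * advantage)


# A good action (positive advantage) whose probability has already risen
# by 50 per cent earns no more than the clipped amount.
print(clipped_surrogate(1.5, 1.0))
# A bad action (negative advantage) that became more likely is penalized in full.
print(clipped_surrogate(1.5, -1.0))
```

Training maximizes the average of this term over many sampled actions, which is one reason the same optimizer can be pointed at very different simulations.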

A general training system

The system is built on two neural networks: one learns to track the cube’s position from images, and the other predicts future rewards for its actions, the goal being to rack up rewards for doing the right thing. PPO is a reinforcement learning algorithm, and Dactyl learned the best strategies to manipulate the cube by chasing points as it completed tasks – with a five-point bonus for success and a 20-point penalty for failure.
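The points scheme can be sketched in a few lines of Python. The five-point bonus and 20-point penalty are the figures given above; the per-step "progress" term, the discount factor of 0.99, and both function names are assumptions added so the example is self-contained, and what counts as failure is left to the caller.

```python
SUCCESS_BONUS = 5.0      # awarded for success
FAILURE_PENALTY = -20.0  # applied on failure


def step_reward(progress, reached_goal, failed):
    """Reward for one timestep.

    progress: hypothetical shaping term, such as how much closer the cube
              moved to the target orientation during this step.
    """
    reward = progress
    if reached_goal:
        reward += SUCCESS_BONUS
    if failed:
        reward += FAILURE_PENALTY
    return reward


def discounted_return(rewards, gamma=0.99):
    """Sum of future rewards, with later rewards weighted less.

    This is the kind of quantity a reward-predicting network is trained to estimate.
    """
    total = 0.0
    for r in reversed(rewards):
        total = r + gamma * total
    return total


# An episode that makes a little progress, succeeds once, then fails.
rewards = [
    step_reward(0.1, False, False),
    step_reward(0.2, True, False),
    step_reward(0.0, False, True),
]
print(rewards, discounted_return(rewards))
```

With a penalty four times the size of the bonus, a policy chasing points has a strong incentive to keep hold of the cube rather than gamble on a risky spin.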

OpenAI's Dota video-game bots were also trained using Rapid and PPO algorithms, albeit using a different architecture and environment with tweaked hyper-parameters.

“After we saw the success of the Dota team with their 1v1 bot, we actually asked them to teach us the ways of Rapid, and we reached parity with our previous learning infrastructure – which we’d spent months building – after only a couple of weeks,” Jonas Schneider, a member of the technical staff at OpenAI, told The Register.

“Still, we were pretty surprised to see that we can even use the exact same optimizer code, and treat Rapid as a black-box optimizer for a simulation problem that’s completely different from the Dota problem it was developed for.”

At the moment, Dactyl can’t do much beyond rotating objects. It can do this with objects other than cubes, such as an octagonal prism, although it struggled more with spheres.


“The vast majority of robots out there today are at one of two extremes: they can either perform very complex tasks in a constrained setting – think of a factory robot welding together rocket parts – or perform very simple tasks in an unconstrained setting – think of a Roomba,” said Schneider.

“That’s why we specifically chose to perform a very complex task in a setting where we don’t have an entirely accurate model of the hand, since we don’t know how to precisely model effects like friction, rolling, contacts and so on.”

The researchers hope that this will eventually lead to progress in building robots that can cope with our volatile and mutable reality while helping humans with chores at home and at work.

“Eventually we hope that this will lower the cost of programming robots for new tasks, which is very cumbersome and expensive today, as well as allowing to use more complex robots for settings where you might not have an engineering team on hand to carefully program them, like you would in a factory setting,” Schneider concluded. ®


Biting the hand that feeds IT © 1998–2022