AI won't take coders' jobs. Humans still rule for now
Plus: ML career help, OpenAI releases free speech-recognition model, and more news
In brief AI probably won't replace software engineers, but it will dramatically change the way they work, especially if they can instruct machines in natural language to generate code.
Several organizations – from OpenAI and Microsoft to Amazon and research labs like DeepMind – have trained neural networks to write code. A recent GitHub survey of more than 2,000 developers found that the vast majority of respondents felt GitHub's Copilot helped increase their productivity, since the AI tool can act like a super-autocomplete, helping devs write boilerplate code for programs more quickly.
But will programmers' jobs be taken by machines in the future? "I don't believe AI is anywhere near replacing human developers," Vasi Philomin, Amazon's vice president for AI services, told IEEE Spectrum.
It's possible developers may not need to learn the syntax and vocabulary of programming languages, and instead will need to focus on understanding concepts and systems to design programs while the AI can do all the boring, nitty-gritty coding work, he opined. In other words, you describe how an application works and a machine-learning model outputs the corresponding code to compile or run.
Peter Schrammel, cofounder of Diffblue, a company focused on automating Java code, agreed that programming jobs will change and engineers will be able to focus more on difficult, creative problems.
"Software developers will not lose their jobs because an automation tool replaces them," he said. "There always will be more software that needs to be written."
Private medical images in public AI training dataset
Photographs of people taken in medical settings have been scraped into a public dataset used to train text-to-image models, all without consent for that particular use.
One artist, who goes by the name of Lapine, was horrified to see that two private images taken for surgical purposes nearly a decade ago are in the LAION-5B dataset used to train popular models such as Stable Diffusion and Google's Imagen. Lapine told Ars Technica she has Dyskeratosis Congenita, a rare genetic condition that impairs bone marrow function and impacts skin tissue.
"It affects everything from my skin to my bones and teeth," she said. "In 2013, I underwent a small set of procedures to restore facial contours after having been through so many rounds of mouth and jaw surgeries. These pictures are from my last set of procedures with this surgeon." Lapine said the surgeon, who stored the medical photos, died in 2018, and somehow the data was obtained, shared online, and downloaded.
Lapine now wants to have her photos removed from the dataset to prevent more models being trained on sensitive, private data. "I would like to have a way for anyone to ask to have their image removed from the data set without sacrificing personal information. Just because they scraped it from the web doesn't mean it was supposed to be public information, or even on the web at all," she said.
OpenAI releases free, open speech recognition model
OpenAI has released an open source neural network named Whisper capable of speech recognition across different languages and accents.
Whisper was trained on a whopping 680,000 hours of audio data scraped from the web. The model splits input data into 30-second chunks to feed into an encoder. A decoder is trained to generate captions for the audio snippet; it is able to identify languages and transcribe speech into English text automatically.
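To make the chunking step concrete, here is a minimal sketch of splitting audio into 30-second pieces. It is an illustration, not OpenAI's code: the function name `split_into_chunks`, the 16 kHz sample rate, and the choice to pad the final chunk with silence are all assumptions made for this example.

```python
from typing import List

SAMPLE_RATE = 16_000   # assumed sample rate in Hz (not stated in the article)
CHUNK_SECONDS = 30     # chunk length mentioned in the article


def split_into_chunks(samples: List[float],
                      sample_rate: int = SAMPLE_RATE,
                      chunk_seconds: int = CHUNK_SECONDS) -> List[List[float]]:
    """Split a mono waveform into fixed-length chunks (hypothetical helper)."""
    chunk_len = sample_rate * chunk_seconds
    chunks = []
    for start in range(0, len(samples), chunk_len):
        chunk = samples[start:start + chunk_len]
        # Assumption: pad the last chunk with silence so all chunks match in length.
        chunk = chunk + [0.0] * (chunk_len - len(chunk))
        chunks.append(chunk)
    return chunks


if __name__ == "__main__":
    # 70 seconds of silence stands in for real audio.
    audio = [0.0] * (70 * SAMPLE_RATE)
    chunks = split_into_chunks(audio)
    print(len(chunks), "chunks of", len(chunks[0]), "samples each")
```

Each fixed-length chunk would then be handed to the encoder in turn; how the real model handles audio that straddles a chunk boundary is not covered here.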
Examples posted by OpenAI show that Whisper can accurately transcribe speech that's fast and jumbled or spoken in a thick Scottish accent, and can translate clips of Korean pop songs.
"We are open-sourcing models and inference code to serve as a foundation for building useful applications and for further research on robust speech processing," OpenAI announced. "We hope Whisper's high accuracy and ease of use will allow developers to add voice interfaces to a much wider set of applications."
You can read more about the model here [PDF] and access the code here.
How do we stop AI ripping off our work?
Artists are thinking about how best to protect their work from being ripped off and copied by netizens using AI models, especially when people are feeding descriptions like, "a summer's afternoon in Times Square, New York City in the style of Rembrandt," into ML software and saving the output.
Established artist Greg Rutkowski's name has been entered as a text prompt into art-generating models more than 93,000 times, far more than some of the world's most famous artists like Pablo Picasso or Leonardo da Vinci, who have each featured in about 2,000 prompts or fewer, MIT Technology Review reported. In other words, people are getting AI models to produce artwork that specifically rips off Rutkowski's style, not to mention that of other artists.
Indeed, people playing around with tools like Midjourney or Stable Diffusion can churn out multiple images that look like Rutkowski's epic fantasy-filled digital paintings in seconds. No skill is needed beyond a text description. Artists like Rutkowski are trying to figure out how these text-to-image systems will affect their work and livelihoods in the future.
Some want to have their work stripped from training datasets so the models can't reproduce their styles, and others believe AI companies should try to form working relationships with museums and artists to better support their work, according to illustrator Karla Ortiz.
"It's not just artists. It's photographers, models, actors and actresses, directors, cinematographers," she said. "Any sort of visual professional is having to deal with this particular question right now."
Cohere For AI Scholars Program
The non-profit research arm of language model startup Cohere has launched a program to recruit engineers who want to start a career in machine-learning research but have yet to publish any papers.
Candidates don't need a specific degree or any experience working in academia. Those accepted onto the program will be paired with experts and work remotely investigating a specific problem in natural language processing from January to August 2023, and will receive financial support.
"We designed this program as a way to create more entry points into machine learning and broaden access to world-class research and engineering expertise," Sara Hooker, head of Cohere for AI, told The Register.
"The best and brightest minds in machine learning transcend borders and often follow different paths into research. That's why we're working to fundamentally change where, how, and by whom research is done. This program is a step in that direction."
"Supporting the next generation of aspiring NLP researchers is essential for pioneering new advancements in machine learning. Unfortunately, today there are very few settings to conduct research on cutting-edge NLP problems and limited access to large-scale ML experimental settings. By broadening access to participation in fundamental research — particularly among folks from alternative backgrounds — the Scholars Program aims to change that," she said.
The deadline to apply for the program is November 7. ®