
Publisher halts AI article assembly line after probe

Plus: Academics debate citing ChatGPT as research co-author; Getty Images sues Stability AI

In brief Consumer tech publisher CNET will pause publishing stories written with the help of AI software, after it was criticized for failing to catch errors in copy generated by machines.

Executives at the outlet said in a call that it would pause publishing AI-assisted articles – for now, according to The Verge.

This comes soon after the website launched a review into its machine-suggested content when it emerged the pieces were factually challenged.

"We didn't do it in secret. We did it quietly," CNET editor-in-chief Connie Guglielmo is quoted as telling staff. The AI engine CNET used was reportedly built by its owner, Red Ventures, and is proprietary.

As well as AI models, the news outlet uses other software to auto-fill information from reports and sources to write stories.

"Some writers – I won't call them reporters – have conflated these two things and had caused confusion and have said that using a tool to insert numbers into interest rate or stock price stories is somehow part of some, I don't know, devious enterprise," Guglielmo said. "I'm sure that's news to The Wall Street Journal, Bloomberg, The New York Times, Forbes, and everyone else who does that and has been doing it for a very, very long time."

CNET began using AI to help write stories for its Money section in November last year. It has not published a new article generated by software since January 13.

Some researchers include ChatGPT as author on papers

Academics are turning to AI software like ChatGPT to help write their papers, prompting journal publishers and other researchers to ask: Should AI be credited as an author?

Large language models (LLMs) trained on data scraped from the internet can be instructed to generate long passages of coherent text, even on technical topics. Tools like ChatGPT that employ LLMs have therefore come to be seen as a route to faster first drafts.

It's no surprise researchers are now using LLM-based tools to write academic papers. At least four studies have listed ChatGPT as an author already, according to Nature. Some believe machines deserve to be credited, whilst others don't believe it's appropriate.

"We need to distinguish the formal role of an author of a scholarly manuscript from the more general notion of an author as the writer of a document," said Richard Sever, co-founder of bioRxiv and medRxiv, two websites hosting pre-print science papers, and assistant director of Cold Spring Harbor Laboratory Press in New York.

Sever argues that only humans should be listed as authors since they are legally responsible for their own work. Leaders from top science journals Nature and Science were also not in favor of crediting AI-writing tools. "An attribution of authorship carries with it accountability for the work, which cannot be effectively applied to [large language models]," said Magdalena Skipper, editor-in-chief of Nature in London.

"We would not allow AI to be listed as an author on a paper we published, and use of AI-generated text without proper citation could be considered plagiarism," added Holden Thorp, editor-in-chief of Science.

Stability AI hit with second lawsuit – this time from Getty

Getty Images sued Stability AI, alleging the London-based startup has infringed on its intellectual property rights by unlawfully scraping copyrighted images from its website to train an image-generation tool.

"It is Getty Images' position that Stability AI unlawfully copied and processed millions of images protected by copyright and the associated metadata owned or represented by Getty Images absent a license to benefit Stability AI's commercial interests and to the detriment of the content creators," Getty said in a January 17th statement. 

Getty isn't totally against text-to-image software – indeed it sells automated digital artwork on its platform. Rather, the stock image biz is annoyed Stability AI didn't ask for explicit permission and pay for its content. Getty has entered into licensing agreements with tech companies, giving them access to images for training models in a way it believes respects intellectual property rights.

Stability AI, however, did not attempt to obtain a license and instead "chose to ignore viable licensing options and legal protections in pursuit of its own commercial interests", Getty claimed. The complaint, filed in the High Court of Justice in London, is the second lawsuit against Stability AI. Last week, three artists launched a class-action lawsuit accusing the company of infringing on people's copyrights to create its Stable Diffusion software.

Anthropic's Claude vs OpenAI's ChatGPT

AI safety startup Anthropic has released its large language model chatbot Claude to a limited number of people for testing.

Engineers at the data-labeling company Scale decided to pit it against OpenAI's ChatGPT, comparing their ability to generate code, solve arithmetic problems, and even answer riddles. 

Claude is similar to ChatGPT and was also trained on large volumes of text scraped from the internet. It uses reinforcement learning to rank generated responses. OpenAI uses humans to label good and bad responses, whilst Anthropic instead uses an automated process.

"Overall, Claude is a serious competitor to ChatGPT, with improvements in many areas," Scale's engineers wrote in a blog post. "While conceived as a demonstration of 'constitutional' principles, Claude feels not only safer but more fun than ChatGPT. Claude's writing is more verbose, but also more naturalistic. Its ability to write coherently about itself, its limitations, and its goals seems to allow it to answer questions more naturally on other subjects."

"For tasks like code generation or reasoning about code, Claude appears to be worse. Its code generations seem to contain more bugs and errors. For other tasks, like calculation and reasoning through logic problems, Claude and ChatGPT appear broadly similar."

In short, AI language models still struggle with the same old issues: They have very little memory, and tend to include errors in the text they produce. ®
