AI conference and NYC's educators ban papers done by ChatGPT
Is machine text plagiarism?
Officials chairing this year's International Conference on Machine Learning (ICML) have banned academics from submitting papers containing text generated by large language models and tools like ChatGPT, and they aren't alone in banning such material.
Text-based generative language models have improved and become widely accessible since OpenAI released GPT-3, its first commercial system, in 2020. Several types of products are now available, each adapted to a different style of writing. People are increasingly using them to generate code, essays, or papers for school or work, prompting organizations like ICML to ban machine-written text.
"Papers that include text generated from a large-scale language model (LLM) such as ChatGPT are prohibited unless the produced text is presented as a part of the paper's experimental analysis," the programme chairs for this year's conference announced in a statement this week.
Academics are allowed to use AI, however, to polish their own writing, meaning they can input their own text into a model and prompt it into editing their work to improve its style or grammar. Academics leading this year's ICML conference said they decided to prohibit AI-generated paper submissions to guard against issues like plagiarism, but the policy isn't set in stone, and may change in the future.
ICML isn't the only organization clamping down on AI-generated text. New York City's education department has blocked students from accessing ChatGPT on public school networks and devices.
"Due to concerns about negative impacts on student learning, and concerns regarding the safety and accuracy of content, access to ChatGPT is restricted on New York City Public Schools' networks and devices," Jenna Lyle, the department spokesperson, said in a statement to Chalkbeat this week.
"While the tool may be able to provide quick and easy answers to questions, it does not build critical-thinking and problem-solving skills, which are essential for academic and lifelong success," she warned.
The coming storm
Language models like ChatGPT are trained on text scraped from the internet. They learn common patterns between words so that, given a text-based instruction or prompt, they can predict what to write next. Whether these systems plagiarize authors or not is debatable; there seems to be little evidence showing they directly parrot known work to generate large chunks of text, but their outputs are based on people's writing. If they merely copy text, is machine-written text subject to copyright issues?
"There is, for instance, a question on whether text as well as images generated by large-scale generative models are considered novel or mere derivatives of existing work. There is also a question on the ownership of text snippets, images or any media sampled from these generative models: which one of these owns it, a user of the generative model, a developer who trained the model, or content creators who produced training examples," ICML leaders asked.
"Since how we answer these questions directly affects our reviewing process, which in turn affects members of our research community and their careers, we must be careful and somewhat conservative in considering this new technology. Unfortunately, we have not had enough time to observe, investigate and consider its implications for our reviewing and publication process. We thus decided to prohibit producing/generating ICML paper text using large-scale language models this year," they added.
Whether academics decide to stick to the rules or not is up to them. There aren't any tools that effectively detect AI-generated text, and ICML will rely only on people flagging up suspicious papers during the review process. Machine-generated text is often plagued with factual errors, and authors writing their papers using AI will probably need to heavily edit its outputs. ®