Colleges snub Turnitin's AI-writing detector over fears it'll wrongly accuse students
By the time they graduate, employers will be making them use LLMs anyway
Some universities are opting out of using Turnitin-made software designed to detect whether text in essays and assignments submitted by students was written by AI.
Turnitin, for those who don't know, offers tools to teachers for identifying plagiarism in people's school work, and in April added the ability to check for machine-written prose. If left enabled, it automatically scans documents and breaks the text into chunks, analyzing each sentence and assigning it a score of 0 if it appears to be composed by a human and 1 if it appears to be automatically generated using AI. An average score is then calculated across the file to predict how much of the text appears to be AI-written.
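Turnitin hasn't published its actual model, but the workflow it describes — score each sentence between 0 and 1, then average over the document — can be sketched roughly as follows. The sentence splitter and the `score_sentence` stub below are hypothetical stand-ins, not Turnitin's method; a real system would use a trained classifier in place of the toy heuristic.

```python
import re

def score_sentence(sentence: str) -> float:
    """Hypothetical per-sentence classifier: 0.0 = human-like, 1.0 = AI-like.
    A toy heuristic stands in for whatever trained model Turnitin uses."""
    return 1.0 if len(sentence.split()) > 25 else 0.0

def document_ai_score(text: str) -> float:
    """Average the per-sentence scores across the document, as Turnitin
    describes, to estimate the fraction of text that looks AI-written."""
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    if not sentences:
        return 0.0
    return sum(score_sentence(s) for s in sentences) / len(sentences)
```

A document scoring near 0.0 under this scheme would be presented as human-written, one near 1.0 as largely machine-generated.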
Various American universities, however, including Vanderbilt, Michigan State, Northwestern, and the University of Texas at Austin, have decided not to use this software over fears that it could lead to students being falsely accused of cheating, as noted by Bloomberg.
Turnitin admitted that its AI text detection tool isn't perfect, but claimed its false positive rate is less than one percent.
Vanderbilt University said even that figure was too high, and would result in mistakenly flagging 750 papers a year, considering it ran 75,000 papers through Turnitin's system in 2022.
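Vanderbilt's 750-papers figure is simply Turnitin's claimed upper bound on the false positive rate applied to its 2022 submission volume, as a quick back-of-the-envelope check confirms:

```python
# Vanderbilt's numbers: 75,000 papers run through Turnitin in 2022,
# against Turnitin's claimed sub-1% false positive rate.
papers_per_year = 75_000
false_positive_rate = 0.01  # Turnitin's stated ceiling

wrongly_flagged = int(papers_per_year * false_positive_rate)
print(wrongly_flagged)  # 750 papers a year flagged in error
```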
"Additionally, there is a larger question of how Turnitin detects AI writing and if that is even possible. To date, Turnitin gives no detailed information about how it determines if a piece of writing is AI-generated. The most they have said is that the tool looks for patterns common in AI writing, but they do not explain or define what those patterns are," the institution's instructional technology consultant Michael Coley explained last month.
"There are real privacy concerns about taking student data and entering it into a detector that is managed by a separate company with unknown privacy and data usage policies. Fundamentally, AI detection is already a very difficult task for technology to solve (if it is even possible) and this will only become harder as AI tools become more common and more advanced. Based on this, we do not believe that AI detection software is an effective tool that should be used."
- OpenAI pulls AI text detector due to it being a bit crap
- No reliable way to detect AI-generated text, boffins sigh
- Plagiarism-sniffing Turnitin tries to find AI writing by students – with mixed grades
- Universities offered software to sniff out ChatGPT-written essays
Annie Chechitelli, Turnitin's chief product officer, told The Register that the AI-flagging tool should not be used to automatically punish students, and that 98 percent of its customers are using the feature. That said, it is automatically turned on, and teachers who don't want to see its scores have to explicitly opt out. They could also leave the feature switched on and ignore it.
"At Turnitin, our guidance is, and has always been, that there is no substitute for knowing a student, their writing style and their educational background," Chechitelli told us.
"Turnitin's technology is not meant to replace educators' professional discretion. Reports indicating the presence of AI writing, like Turnitin's AI writing detection feature, simply provide data points and resources to support a conversation with students, not determinations of misconduct.
"It is one piece of a broader puzzle that includes many components."
Even if software like Turnitin's AI detector isn't meant to be used as a way to automatically penalize students, its results can still heavily sway teachers' judgments.
A lecturer at Texas A&M University-Commerce, for example, raised eyebrows when he used ChatGPT in an attempt to detect whether papers he was marking were ML-written or not. Students' grades were put on hold, and some were cleared of cheating while others resubmitted their work.
Figuring out whether text was created by a human or machine is difficult. OpenAI took down its AI-output classifier six months after it was released due to its poor accuracy, and said it was trying to come up with new methods of detecting AI-generated content.
To further complicate matters, AI detection software can easily be thrown off when analyzing text that was written by humans and then edited using AI, and vice versa. A previous study led by computer scientists at the University of Maryland found that the chances of the best classifiers detecting AI text are no better than a coin toss. ®