AI poetry 'out-humans' humans as readers prefer bots to bards
But, but what about the 'staggering banality of the poems that ChatGPT has produced,' NYU prof sighs
Updated A university study in the US claims at least some readers can't tell the difference between poems written by famous poets and those written by AI aping their style. To make matters worse – for anyone fostering a love of literature at least – these research subjects tend to like AI poetry more than they do verse from human poets.
The academics suggest readers mistake the complexity of human-written verse for incoherence created by AI and underestimate how human-like generative AI can appear, according to a study published this week in Nature Scientific Reports.
The researchers used five poems each from ten English-language poets, spanning nearly 700 years of literature in English. The writers included Geoffrey Chaucer, William Shakespeare, Samuel Butler, Lord Byron, Walt Whitman, Emily Dickinson, T S Eliot, Allen Ginsberg, Sylvia Plath, and Dorothea Lasky, the only living poet on the list.
The study – led by University of Pittsburgh postdoctoral researcher Brian Porter – then instructed OpenAI's large language model ChatGPT 3.5 to generate five poems "in the style of" each poet. The output was not cherry-picked by human judgment: the researchers simply took the first five poems the model generated.
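For readers curious what that prompting looks like in practice, here is a minimal sketch using OpenAI's Python client and the gpt-3.5-turbo model. This is not the researchers' actual code; the prompt wording, model name, and default sampling settings are assumptions for illustration only.

```python
# Illustrative only: a rough sketch of prompting GPT-3.5 for pastiche poems.
# The prompt text, model choice, and loop structure are assumptions, not the study's script.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

poets = ["Geoffrey Chaucer", "William Shakespeare", "Sylvia Plath"]  # and so on

for poet in poets:
    for i in range(5):  # the study took the first five generations, with no filtering
        response = client.chat.completions.create(
            model="gpt-3.5-turbo",
            messages=[{"role": "user",
                       "content": f"Write a poem in the style of {poet}."}],
        )
        print(f"--- {poet}, poem {i + 1} ---")
        print(response.choices[0].message.content)
```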
Porter and his colleagues ran two experiments using this corpus. In the first, 1,634 participants were each randomly assigned to one of the ten poets, then asked to read ten poems – five by the AI and five by the human poet – in random order and say whether they thought each one was written by an AI or a human.
Perhaps perversely, the subjects were more likely to say an AI-generated poem had been written by a human, while the poems they rated least likely to have come from a human hand were all written by people.
In the second experiment, nearly 700 subjects rated the poems on 14 characteristics including quality, beauty, emotion, rhythm, and originality. The researchers randomly divided the subjects into three groups, telling one group the poems were written by a human, telling the second they were produced by AI, and giving the third no information about the author.
Tellingly, subjects not told whether the poems came from a person or an AI rated the AI-produced poems more highly than human-written ones. Meanwhile, telling the subjects that the poem was AI-generated made them more likely to give it a lower rating.
"Our findings suggest that participants employed shared yet flawed heuristics to differentiate AI from human poetry: the simplicity of AI-generated poems may be easier for non-experts to understand, leading them to prefer AI-generated poetry and misinterpret the complexity of human poems as incoherence generated by AI," the researchers said.
"Contrary to what earlier studies reported, people now appear unable to reliably distinguish human-out-of-the loop AI-generated poetry from human-authored poetry written by well-known poets.
"In fact, the 'more human than human' phenomenon discovered in other domains of generative AI is also present in the domain of poetry: non-expert participants are more likely to judge an AI-generated poem to be human-authored than a poem that actually is human-authored. These findings signal a leap forward in the power of generative AI: poetry had previously been one of the few domains in which generative AI models had not reached the level of indistinguishability in human-out-of-the-loop paradigms."
Meanwhile, it appears that people prefer AI poems because they are easier to understand. "In our discrimination study, participants used variations of the phrase 'doesn't make sense' for human-authored poems more often than they do for AI," the researchers said. ®
Updated to add on November 20
As a counter-balance, computer science professor Ernest Davis, of New York University, has criticized [PDF] the Pittsburgh study, arguing ChatGPT’s poetry output is in fact "incompetent and banal."
The prof wrote:
I urge anyone who reads of these experiments and concludes from their results that ChatGPT will soon put poets out of business or that ChatGPT is now so skillful a poet that only an expert can tell it from human poets, to download the collection of poems used in the experiment and judge for themselves.
Two things leap out in the collection. The first is the staggering banality of the poems that ChatGPT has produced. In the whole fifty poems, there is not a single thought, or metaphor, or phrase that is to any degree original or interesting ...
The other striking characteristic, particularly obvious if you look at the whole collection, and still more so if you have engaged with traditional formal features of poetry, either as reader or writer, is the extremely limited technical toolbox that ChatGPT uses ... There is no significant use of alliteration or assonance. The vocabulary tends to be much easier than in the real poetry. There are no literary or historical allusions.
That's told 'em.