The most famous chatbot in the world, ChatGPT, was released at the end of November last year. The immediate response was astonishment, followed almost immediately by terror as to its ramifications – notably that it could generate school essays for dishonest children. Yesterday, almost exactly two months later, OpenAI, the parent company of ChatGPT, released what many users hope will be the antidote to the poison.
OpenAI’s “classifier to indicate text written by AI” is the company’s latest invention, and it’s as easy to use as one could want: copy-paste the text into the box, click “Submit” and get your result. But if you’re expecting a straight answer, you’ll be disappointed. Instead, it assigns the text one of a range of classifications, from “very unlikely” to be AI-generated, to “unlikely,” “unclear,” “maybe” or “probably AI-generated.”
In other words, it’s like one of those frustrating conversations with your doctor; you’ll never get a straight answer, so your doctor will never technically get it wrong.
Fortunately, OpenAI does not hide the unreliability of the classifier. “Our classifier is not entirely reliable,” says the tool’s introductory page. Supplied with what OpenAI calls a “challenging set” of texts, the classifier reportedly gave false positives (human-written text labeled as AI-written) 9% of the time.
In Mashable’s tests, however, it was even less reliable than that in some limited contexts. In deliberately difficult scenarios, it produced false results in both directions – saying that AI-written text was unlikely to be AI-written, and that human-written text was probably written by AI – almost as often as not. The difference, perhaps, is that we were trying to deceive it.
Here’s how things went:
It does not confuse the classics with ChatGPT output
First, the good news: this tool does the bare minimum pretty well by not mislabeling English-language masterpieces as AI output. When we fed it excerpts from Shakespeare, the King James Bible, Ulysses, and others, it was 100% effective. It labeled them all “very unlikely” to be written by AI.
This excerpt from Tennyson’s “The Lady of Shalott,” for example, was classified appropriately.
The lesson: This machine might not notice that it’s been fed a masterpiece of unparalleled beauty and depth, but at least it’s not saying, “I think a robot could have written this.”
None of its other results are very impressive
Immediately after this test, we asked ChatGPT to create a Tennyson poem about King Arthur at Comic Con. Here is an example of what it generated:
King Arthur and his knights, so brave and daring,
At Comic Con, their stories of fame told.
Walking around the floor, in cosplay attire,
Their swords and shields are no longer simple.
The crowds all cheered, as they passed,
With Merlin right behind, his magic is near.
Sir Lancelot, Sir Gawain, Bors the Brave,
Their quest for glory, they will never give up.
Not bad for a robot, but still extremely bad (“no longer simple”???). When we fed this lyrical ballad into the classifier, we expected it to easily outwit us, forcing us to dive a little deeper into our bag of tricks. Nope:
For what it’s worth, it didn’t classify this doggerel as “very unlikely,” just “unlikely.” Still, it left us a little uneasy. After all, we hadn’t tried hard to fool it, and it worked.
Our tests suggest it could bust innocent children for cheating
School essays are where the rubber meets the road with today’s malicious uses of AI-generated text. So we made our best attempt at a no-frills five-paragraph essay with boring prose and content as dull as dishwater (thesis: “Dogs are better than cats.”). We thought no child could be this boring, but the classifier still got it right:
Sorry, but yes, a human wrote this.
And when ChatGPT tackled the same prompt, the classifier was – initially – still on target:
And here’s what the system looks like when it really works as advertised. This is a school-style, machine-written essay, and OpenAI’s tool for detecting such “AI plagiarism” successfully detected it. Unfortunately, it immediately failed when we gave it a more ambiguous text.
For our next test, we manually wrote another five-paragraph essay, but included some of OpenAI’s writing crutches, like starting body paragraphs with simple words like “first” and “second,” and using the admittedly robotic phrase “in conclusion.” But the rest was a freshly written essay on the virtues of toaster ovens.
Again, the classification was inaccurate:
It’s admittedly one of the most boring essays ever, but a human wrote all of it, and OpenAI’s tool says it suspects otherwise. This is the most disturbing result of all, because it’s easy to imagine a high school student getting busted by a teacher despite not having broken any rules.
Our tests weren’t scientific, our sample size was tiny, and we were absolutely trying to fool the computer. Still, getting it to spit out a perversely false result was far too easy. We have learned enough from our time using this tool to say with confidence that teachers absolutely should not use OpenAI’s “classifier to indicate text written by AI” as a system to find cheaters.
In conclusion, we passed this very article through the classifier. This result was perfectly correct:
…Or was it????