Imagine that your company has just hired new talent, a rising executive star so attractive that a rival company has just hired a lookalike. The buzz around them is intoxicating. Everyone seems to agree, from the CEO to the shareholders: this person is the future of the whole company.
Then you learn that the executive has what’s politely called a “hallucination problem.” Every time they open their mouth, there’s a 15-20 percent chance they’ll make stuff up. A professor at Princeton calls them a bullshit generator. They literally can’t tell truth from fiction. They’re due on stage to unveil a new product in five minutes. Are you still pushing them into the spotlight?
For Microsoft and Google this week, the answer was yes. Excited by the success of OpenAI’s ChatGPT, the artificial intelligence chatbot that hit 100 million monthly active users just two months after launch, Microsoft held a last-minute surprise event to announce that it would bring ChatGPT-style search to its Bing search engine and Edge browser. Google announced its own AI-based search tool, Bard, the day before, then unveiled it at an event in Paris the next day, only to run into its own hallucination problem.
“A new race begins today,” Microsoft CEO Satya Nadella told reporters Tuesday at the company’s campus in Redmond, Washington. Isn’t it pretty to think so? Microsoft, the perpetually uncool kid on the tech block, would love you to believe that Bing — sorry, “the new Bing” — is in a race with Google Search in anything.
Google’s preemptive announcement of Bard dripped with condescension: “We re-oriented the company around AI six years ago,” Google CEO Sundar Pichai wrote.
Google and the “hallucination problem”
Which is telling. Google, the world leader in search, has had years to integrate AI, and its ChatGPT rival Bard is barely in beta with a small group of testers. For all of Pichai’s hipster affectation, Bard’s unveiling was an unforced mess. Google, too, seems to have been caught off guard by all the ChatGPT buzz.
How else to explain Bard’s embarrassing mistake on full display at launch — not at the event itself, where demo errors are expected, but in a pre-made GIF? A user is shown asking Bard for facts they can tell their 9-year-old about the James Webb Space Telescope. One of those “facts,” that the JWST took the very first photo of an exoplanet, is wrong. Bard was hallucinating.
No wonder parent company Alphabet lost as much as 8 percent of its stock price on Bard’s launch day. Google had put the central problem of AI search on full display, and suggested in the process that the company couldn’t use its own vast trove of search data to fact-check itself.
Google should know better, given that it already had a “hallucination problem” with the featured snippets at the top of its search results back in 2017. The snippets algorithm seemed to particularly enjoy telling lies about U.S. presidents. Again: what could go wrong?
In other words, launch your AI search tool too early and you risk playing yourself. Microsoft was lucky, in that no obvious errors surfaced during its launch event. But if ChatGPT-based search isn’t riddled with errors, why is it in such a tentative beta stage? Side note: if you’d like to perform unpaid AI QA for Bing, there’s a sign-up sheet.
“There’s still a lot to do there,” said Sarah Bird, Chief AI Officer at Microsoft (a telling title!), in response to a question from Wired about ChatGPT’s hallucination problem. Yeah, no kidding: the 15 percent hallucination figure comes from a company that’s in its own race to build a ChatGPT fact-checker.
Bird added that previous versions of the software could help users plan a school shooting, but that capability had been disabled. Good to know! What could go wrong next? Surely there are no other unintended consequences lurking in this hallucinatory beta search product that could embarrass a large, legally vulnerable tech giant.
Clippy. Zune. New Bing.
Microsoft knows this kind of embarrassment, of course: it’s the company that gave us one of the biggest duds in software history, Clippy. The paperclip assistant was famous for dispensing unwanted advice. ChatGPT is not Clippy, in the sense that we come to it with questions.
But the fact that it often hallucinates its answers — or, more often than you might think, gives users a banal variation on “I can’t answer that” — could make ChatGPT-enabled Bing a kind of Clippy on LSD. If enough casual users of the “new Bing” get garbled results, that’s what it will be remembered for.
It doesn’t matter if a product improves later; the initial popular response is what can make it a punchline. Microsoft should know that too; it gave us the Zune. Deploying a ChatGPT product before it’s truly ready for prime time is no different.
“The new Bing” already begs to be a punchline, honestly. Or are you really ready to ditch Google Search and your Chrome browser for Bing and Edge, should the latter pair win the AI search race, whatever “winning” actually means here? Didn’t think so. Technological inertia is vastly underrated as a force.
ChatGPT is impressive in some contexts — real estate agents, in particular, love it for writing listings — and scary in others. But every scary story about it seems somehow less so once you dig beneath the headline. It will unleash a wave of student plagiarism! Except it can also tell you when a piece of text was written by ChatGPT, neutralizing its own threat. It passed a law school exam! Except it actually just scraped by with a C-plus.
Here’s the thing: building the digital equivalent of a human brain, known in AI circles as “general AI,” is really hard. We’re only just reaching the insect-intelligence stage, another long-standing goal of AI. Will you really trust ChatGPT to deliver your search results, rather than, you know, clicking the links yourself?
The answer may well depend on how much of a problem you, dear reader, have with hallucinations.