AI tools burst into the public consciousness last year with a new release of ChatGPT that seemed pretty smart to most people. It’s not yet smarter than a human. But it’s surprisingly good. And it’s fast. And it’s cheap.
I have no interest in writing my novels using AI. My novels should be mine, not somebody else’s.
But a lot of people use AI as a research tool. As a better version of Google.
What Could Possibly Go Wrong?
You’ve probably heard that ChatGPT and other tools that use “large language models” are prone to hallucination, meaning they tend to make up an answer when they don’t know.
What’s the story here? True or not true?
I recently had a research question that I took to ChatGPT to see how it would do. As some of you know, I write historical novels, and I’ve read the works of many historians over the years. I’ve read so many, it’s sometimes hard to remember who exactly said what.
I had a vague recollection that one particular historian (we’ll call him Fred) had written on a particular topic. But I wasn’t sure. A search on Google didn’t bring up anything. So I asked ChatGPT what “Fred” had said on this subject.
ChatGPT replied with a general high-level summary of Fred’s thoughts on this subject, which it organized as five bullet points. I am very familiar with Fred’s writings, and these certainly sounded like things Fred would say, but I was looking for references, exact quotes.
I asked ChatGPT for a specific reference for bullet point 4. It immediately referred me to one of Fred’s well-known books, which I have in my library. I bought it many years ago, the fifth edition of the book. ChatGPT referred me to the sixth edition of the book, and it said I could find Fred’s thoughts explained well in Chapter 9. It even gave the title of Chapter 9.
I opened my copy of the fifth edition and found that the title of Chapter 9 was not the one ChatGPT had given. So I went to Amazon and bought the e-book version of the sixth edition, just so I could drill down to the actual words of Fred.
When I opened the sixth edition, I found that the title of Chapter 9 in this edition was the same as in the paper copy I bought 40 years ago. ChatGPT had gotten the chapter title wrong. It wasn’t close.
Worser and Worser
I told ChatGPT it had made a mistake, that it had given the wrong title for Chapter 9, and in fact that chapter had nothing to do with the question I’d asked in the first place.
ChatGPT immediately apologized for the confusion. It told me the actual material I wanted was found in Chapter 8 and Chapter 10 of the book. And it again gave me the titles of these chapters.
But both titles were wrong, and those chapters also had nothing to do with the original question. It seemed that ChatGPT was simply digging itself in deeper and deeper.
I was getting impatient. I told ChatGPT that neither chapter title was correct, and I asked for a direct quote of Fred’s own words. I figured I could then do an electronic search for any prominent words in the quote, and that would take me to the passage where Fred dealt with my question.
ChatGPT again apologized for the “confusion” and gave me a direct quote from Fred. It was two full sentences, about 70 words long. I’ve read enough of Fred’s books to know his voice. The quote sounded exactly like what he would have said. So I searched for several words in the quote.
They weren’t there. ChatGPT had invented a direct quote from Fred. When people talk about hallucinations of ChatGPT, this is what they mean.
I challenged ChatGPT again, telling it that I have the book, and the quotation it had given me was NOT in the book anywhere.
More Apologies
ChatGPT apologized again. I will say this for it—ChatGPT is good at apologizing. I’ve known many people who make up stuff but never apologize for their mistakes. ChatGPT at least will admit it’s wrong when you confront it.
ChatGPT did more than apologize though. It gave me a shorter quotation from Fred’s book, giving me the page number on which the words could be found. I checked. Another hallucination.
My response to ChatGPT was very terse: “I have the book. This quotation is not found in it.”
You can guess what happened next. ChatGPT apologized again, and gave me another quote that sounded exactly like what Fred would say. This time, ChatGPT attributed it to Chapter 8.
But once again, the quote was not in the book. So I challenged ChatGPT yet again, and again it apologized and gave me another quote that sounded perfectly authentic, but wasn’t.
The Bottom Line
By the end of the session, ChatGPT had given me five apologies for incorrect statements. But I never did find out what Fred thinks on the particular topic.
Let’s be clear. When I started the session, I had a vague recollection that Fred had written something once on this topic, but I couldn’t remember exactly what he said or where it was found. I knew that I didn’t know the answer.
During the session, ChatGPT confidently gave me five answers that were no better than my vague recollection. They were no worse, but I was looking for better, and ChatGPT was just as bad at pulling up the actual quote as I was. But ChatGPT didn’t know that it didn’t know.
There’s a saying that’s been going around for a while, that “You don’t know what you don’t know.” This is often associated with the Dunning-Kruger effect, which says (very roughly) that people who are very ignorant of a subject often overestimate their level of understanding of it, whereas highly informed people accurately estimate their level of understanding.
Or in other words, ignorant people don’t know that they don’t know something. But competent people know when they know, and they also know when they don’t know.
I am pretty well-read in my chosen time period of history, but there are many things I don’t know. And I’m at least competent enough to know that I don’t know something.
ChatGPT does not know when it doesn’t know something. And there’s our hazard.
Because ChatGPT was trained on information scraped from the internet. And now many people are using it to write articles to post on the internet. That’s much easier than doing the hard work yourself. In a morning’s work, you could use ChatGPT to write dozens of articles.
Which means that future variants of ChatGPT and other large language models will probably be trained on information that was hallucinated by their own previous versions. That can’t end well.
Should You Use ChatGPT for Research?
Yes, of course you should. I recently asked ChatGPT a math question which I’d been puzzling over for a couple of hours. It instantly suggested a possible solution. That wasn’t quite what I needed, so I pointed out some flaws in its suggestion. It immediately suggested an improvement. That was also not quite what I needed, but it was close. And I solved the math problem.
The moral is this: Trust, but verify. (That’s an old Russian proverb that Ronald Reagan learned about and repeated often. John Kerry later updated it to “Verify and verify.”)
Use ChatGPT for ideas. It can be very creative. But verify everything. If ChatGPT gives you a direct quote, with references and page numbers, look it up and make sure.
Human sources are fallible too, but good human sources written by experts in their field have a huge advantage over ChatGPT—they know when they don’t know.