AI tools burst into public consciousness last year with a new release of ChatGPT that seems pretty smart to most people. It's not yet smarter than a human. But it's surprisingly good. And it's fast. And it's cheap.
I have no interest in writing my novels using AI. My novels should be mine, not somebody else's.
But a lot of people use AI as a research tool. As a better version of Google.
What Could Possibly Go Wrong?
You've probably heard that ChatGPT and other tools that use "large language models" are prone to hallucination. Meaning they tend to make up an answer when they don't know.
What's the story here? True or not true?
I recently had a research question that I took to ChatGPT to see how it would do. As some of you know, I write historical novels, and I've read the works of many historians over the years. I've read so many, it's sometimes hard to remember who exactly said what.
I had a vague recollection that one particular historian (we'll call him Fred) had written on a particular topic. But I wasn't sure. A search on Google didn't bring up anything. So I asked ChatGPT what "Fred" had said on this subject.
ChatGPT replied with a general high-level summary of Fred's thoughts on this subject, which it organized as five bullet points. I am very familiar with Fred's writings, and these certainly sounded like things Fred would say, but I was looking for references, exact quotes.
I asked ChatGPT for a specific reference for bullet point 4. It immediately referred me to one of Fred's well-known books, which I have in my library. I bought it many years ago, in its fifth edition. ChatGPT referred me to the sixth edition of the book, and it said I could find Fred's thoughts explained well in Chapter 9. It even gave the title of Chapter 9.
I opened my copy of the fifth edition and found that the title of Chapter 9 was not the one ChatGPT had given. So I went to Amazon and bought the e-book version of the sixth edition, just so I could drill down to the actual words of Fred.
When I opened the sixth edition, I found that the title of Chapter 9 in this edition was the same as in the paper copy I bought 40 years ago. ChatGPT had gotten the chapter title wrong. It wasn't close.
Worser and Worser
I told ChatGPT it had made a mistake, that it had given the wrong title for Chapter 9, and in fact that chapter had nothing to do with the question I'd asked in the first place.
ChatGPT immediately apologized for the confusion. It told me the actual material I wanted was found in Chapter 8 and Chapter 10 of the book. And it again gave me the titles of these chapters.
But both titles were wrong, and those chapters also had nothing to do with the original question. It seemed that ChatGPT was simply digging itself in deeper and deeper.
I was getting impatient. I told ChatGPT that neither chapter title was correct, and I asked for a direct quote of Fred's own words. I figured I could then do an electronic search for any prominent words in the quote, and that would take me to the passage where Fred dealt with my question.
ChatGPT again apologized for the "confusion" and gave me a direct quote from Fred. It was two full sentences, about 70 words long. I've read enough of Fred's books to know his voice. The quote sounded exactly like what he would have said. So I searched for several words in the quote.
They weren't there. ChatGPT had invented a direct quote from Fred. When people talk about ChatGPT's hallucinations, this is what they mean.
I challenged ChatGPT again, telling it that I have the book, and the quotation it had given me was NOT in the book anywhere.
More Apologies
ChatGPT apologized again. I will say this for it: ChatGPT is good at apologizing. I've known many people who make up stuff but never apologize for their mistakes. ChatGPT at least will admit it's wrong when you confront it.
ChatGPT did more than apologize, though. It gave me a shorter quotation from Fred's book, along with the page number on which the words could be found. I checked. Another hallucination.
My response to ChatGPT was very terse: "I have the book. This quotation is not found in it."
You can guess what happened next. ChatGPT apologized again, and gave me another quote that sounded exactly like what Fred would say. This time, ChatGPT attributed it to Chapter 8.
But once again, the quote was not in the book. So I challenged ChatGPT yet again, and again it apologized and gave me another quote that sounded perfectly authentic, but wasn't.
The Bottom Line
By the end of the session, ChatGPT had given me five apologies for incorrect statements. But I never did find out what Fred thinks on the particular topic.
Let's be clear. When I started the session, I had a vague recollection that Fred had written something once on this topic, but I couldn't remember exactly what he said or where it was found. I knew that I didn't know the answer.
During the session, ChatGPT confidently gave me five answers that were no better than my vague recollection. They were no worse, but I was looking for better, and ChatGPT was just as bad at pulling up the actual quote as I was. But ChatGPT didn't know that it didn't know.
There's a saying that's been going around for a while: "You don't know what you don't know." This is often associated with the Dunning-Kruger effect, which says (very roughly) that people who are very ignorant of a subject often overestimate their level of understanding of it, whereas highly informed people accurately estimate their level of understanding.
Or in other words, ignorant people don't know that they don't know something. But competent people know when they know, and they also know when they don't know.
I am pretty well-read in my chosen time period of history, but there are many things I don't know. And I'm at least competent enough to know when I don't know something.
ChatGPT does not know when it doesn't know something. And there's our hazard.
Because ChatGPT was trained on information scraped from the internet. And now many people are using it to write articles to post on the internet. That's much easier than doing the hard work yourself. In a morning's work, you could use ChatGPT to write dozens of articles.
Which means that future variants of ChatGPT and other large language models will probably be trained on information that was hallucinated by their own previous versions. That can't end well.
Should You Use ChatGPT for Research?
Yes, of course you should. I recently asked ChatGPT a math question which I'd been puzzling over for a couple of hours. It instantly suggested a possible solution. That wasn't quite what I needed, so I pointed out some flaws in its suggestion. It immediately suggested an improvement. That was also not quite what I needed, but it was close. And I solved the math problem.
The moral is this: Trust, but verify. (That's an old Russian proverb that Ronald Reagan learned about and repeated often. John Kerry later updated it to "Verify and verify.")
Use ChatGPT for ideas. It can be very creative. But verify everything. If ChatGPT gives you a direct quote, with references and page numbers, look it up and make sure.
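If you want to automate the "search for prominent words" check I described earlier, here's a minimal sketch in Python. It assumes you've already exported the e-book to a plain text file; the file name book.txt and the helper find_quote are hypothetical names of my own, not anything ChatGPT or your e-reader provides.

```python
import re

def find_quote(book_path: str, quote: str, min_word_len: int = 6) -> list[str]:
    """Return each line of the book containing a distinctive word from the quote."""
    # Keep only the longer words; short words like "the" would match everywhere.
    words = {w.lower() for w in re.findall(r"[A-Za-z']+", quote) if len(w) >= min_word_len}

    hits = []
    with open(book_path, encoding="utf-8") as f:
        for line in f:
            lowered = line.lower()
            if any(w in lowered for w in words):
                hits.append(line.strip())
    return hits

# Paste the alleged quote here, then see whether its distinctive
# words actually appear anywhere in the book's text.
alleged_quote = "..."  # the quote ChatGPT gave you
for match in find_quote("book.txt", alleged_quote):
    print(match)
```

If none of the quote's distinctive words shows up anywhere in the book, you're almost certainly looking at a hallucination.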
Human sources are fallible too, but good human sources written by experts in their field have a huge advantage over ChatGPT: they know when they don't know.