Artificial Intelligence (ChatGPT) Tries the Civil War

ECW welcomes back guest author Dan Walker

How much use would artificial intelligence (AI) be for Civil War research? Lately the news has been full of stories about advances in AI and its benefits and risks for today’s world. But what about historical questions? The Civil War was a long time ago.

ChatGPT is an AI program that OpenAI released to the public in late 2022. Is it worth using? From my experience I would say yes, but anyone expecting to sit back and watch it produce publishable texts will be disappointed, and even using it as a helper carries serious risks.

What ChatGPT is and how it is used:

ChatGPT is what is called a large language model; that is, it is trained to understand and interact with the verbal constructions ordinary people use. To use it, go to OpenAI.com, establish an account, select "New Chat," type a question or prompt into the blank panel, and click the send icon. A response, even a lengthy one, appears in a second or two, which can be disconcerting! A user who is inactive for long enough will be logged out but can resume at the next log-in. At that point, a title the program gave the chat will appear in a list of those the user has already created.
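For readers who would rather script such a session than type into the web panel, OpenAI also offers the same models through a programming interface. Below is a minimal sketch in Python, reflecting how the openai package worked as of this writing; the model name and the prompt are stand-ins of my own, not recommendations, and the library's interface changes over time.

    # Minimal sketch: sending one prompt to a ChatGPT model via OpenAI's API,
    # as the `openai` Python package worked at the time of writing.
    # Assumes `pip install openai` and an API key from one's OpenAI account.
    import os
    import openai

    openai.api_key = os.environ["OPENAI_API_KEY"]  # key stored in the environment

    response = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",  # illustrative model name
        messages=[{
            "role": "user",
            "content": "Summarize the extent and effect of desertion and "
                       "conscription-avoidance in the Confederacy, citing sources.",
        }],
    )

    # The reply comes back as text, just as it would appear in the chat window.
    print(response["choices"][0]["message"]["content"])

Everything said below about checking the app's answers applies just as much to answers fetched this way.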

What it does:

According to its mission statement,[i] ChatGPT's purpose is to collect and analyze information and to help the user understand it. Its "training material" stops at October 2021. Since the Civil War, its records, and most authoritative commentary on them all date to well before that cutoff, research in this field should be well beyond the pitfalls of emotion and intuition, which the application itself warns us are not in its lane.

What could go wrong then? A good bit, it seems. I ran a few test questions, then tried a deeper query on a topic that interested me, but which I did not know much about.

Here’s what I learned, which might be grouped into the Good, the Bad, and the Weird:

A Case Study: Confederate Desertion and Draft-Dodging

The Good:

Already warned to avoid speculation, I tried what I thought was a straightforward, fact-based prompt: the extent and effect of desertion and conscription-dodging in the South during the War, especially in the backcountry. By now I had also learned to ask for sources, not assume they would be cited. In a matter of seconds, I got an abstract followed by an eight-paragraph paper citing four sources that, when I looked them up, did exist (yes, a low bar), and seemed apt for the topic.

Here are the prompt and the abstract:

[Screenshot: the prompt and ChatGPT's abstract.][ii]

Interesting: The app used “we” whereas elsewhere it used “I.” Why? Was it assuming a more “editorial” viewpoint for a “paper”? A small point, but I’d been warned that the app has been rewarded for adopting academic conventions to enhance credibility for its work, however flawed. But as far as content went, the app got credit (so far) for the following—in just eight paragraphs:

  • Providing a good conceptual overview of the issues—useful for a researcher new to the topic.
  • Credibly estimating, as requested, the number of soldiers lost to the South through desertion and conscript-avoidance by the end of 1864 (about 215,000), citing sources.
  • Coherently summarizing what the literature suggests about causes and effects, identifying social, cultural, and sectional influences.
  • Suggesting possible sources for further research.

The Bad:

The paper's assertions and supporting reasoning seemed solid, though the development through specific evidence was thin. To be fair, I had not specified a length or a number of sources, but there were still problems:

(1) The four sources cited were respectable, but all secondary. I had not asked for primary sources, but the abstract had promised them, so where were they?

(2) There was no list of References, which I had asked for.

To my follow-up request, the chat re-posted its essay, this time with a six-source References list: the original four, plus two primary ones, a September 9, 1864, letter from Confederate President Jefferson Davis to General Braxton Bragg and a January 23, 1863, letter to Davis from General Robert E. Lee.

Two primary sources: good! But once I started to drill down, the documentation began to crumble. Here is part of the list where things went wrong:

[Screenshot: the revised References list.][iii]

First, I noticed something I'd missed: the publication date for the Inscoe source seemed off by a year. It should be 1996, shouldn't it, not 1997? The app admitted the error and apologized "for the confusion." OK, fine. The source exists and seemed apt for its use. I actually ordered another source on the list from Amazon, More Damning Than Slaughter: Desertion in the Confederate Army, by Mark Weitz,[iv] which I highly recommend, although again the app got the publication year wrong. Why?

Worse, I couldn't find either of the two letters. The Davis letter to Bragg might be in the volume cited: Volume 10 of the Jefferson Davis papers was indeed published in 1991. But a search for that volume at the LSU Press website gives August 1864 as the end date for its contents, a month before the letter was supposedly written. I could have looked around for Volume 11, but LSU didn't have it, and life is short. After the app "apologize[d] for any inconvenience," it suggested I look up the letter in the Official Records (OR), then provided this link: http://collections.library.cornell.edu/moa_new/waro.html[v]

But wait! If the letter is likely available in the OR, why didn’t the app simply tell me where it was? After all, the app itself cited the OR for the Lee letter to Davis. Thus, I hoped at least to find a direct hit for that source.

No such luck. Lee and Davis wrote a lot to each other in January 1863 (Was Burnside going to try again after his December disaster at Fredericksburg? What was the meaning of all those gunboats and transports massing on the lower Potomac?), but nothing about desertion. Lee's main complaint at this point was lack of provisions and the inefficiency of the Richmond-Fredericksburg railroad.

If, in frustration, we ask the application how these things can happen, we are likely to see the following, from which I have cut the introductory apology:

[Screenshot: ChatGPT's explanation.][vi]

The response closes with the advice to check its information against "other reliable sources," ironic advice, since it implies the app itself is "reliable," which is already in doubt.

Results of the test:

If I wanted a well-organized introduction to a topic with good suggestions for follow-up, and if I was willing to do a thorough check of almost every detail, willing to ask questions, and willing to put up with surprising gaps in the app’s “training” base, then I got something worthwhile. But if I was looking for a reliable peer-level co-researcher, ChatGPT was not it.

Source problems reported by other researchers:

Researchers in other fields have found similar problems with sources. The application may even seem to have made up a source, as David Wilkerson, editor of the Oxford Review, was forced to conclude.[vii]

In one case an art professor found that a book ChatGPT cited, supposedly written by a colleague of hers, was likely a misidentified article. Apparently the app had confused the article with a chapter the colleague had contributed to a book, then inflated the chapter into a whole book title, or something like that. It took several "sorry for the confusion" replies to untangle this,[viii] and even I may have misunderstood.

The “Weird”:

Problems with certain topics: Weaknesses in thinking

In my Southern-desertion test case, the question was the kind the app handles best: one for which there is, presumably, a “right” answer, or at least for which good answers are “convergent,” as we say.

But if we want human-level speculation and contextual awareness, we may get answers that seem weirdly off base (e.g. don’t try the Jackson-at-Gettysburg question—just trust me). The app will do better investigating what did happen than what might have happened.

But even what did happen may confuse it, if it involves motives. Why, I asked, did Confederate General Robert E. Lee reorganize the Army of Northern Virginia into three corps in 1863? I got a response with the two usual reasons, the irreplaceability of Jackson and the general attrition among Confederate officers and men at Chancellorsville, but then this:

[Screenshot: ChatGPT's additional reason.][ix]

I’d never seen this! I asked for a source and was told first that it was “a common interpretation among historians.” Really? Could I see the source? I now got this admission:

[Screenshot: ChatGPT's admission.][x]

But I didn’t give up: Where had it seen this claim, “widely accepted” or not? The app could not tell me. So, as with the Lee-to-Davis letter, the app may admit to having made a mistake, but may not be able to tell where it got the wrong idea. Should I have asked for citations in the first place? Absolutely, but as we’ve seen, even that might not have worked.

What I learned:

The app is a useful aid if we understand its limits, focus on answerable questions, avoid speculative ones, and, if we have a concern, ask. In fact, this is pretty much what ChatGPT and Wikipedia[xi] have to say about each other: Good place to start, but check everything.

Dan has been an educator for more than fifty years, teaching English, creative writing, and interdisciplinary humanities in high school, and professional education at the undergraduate and graduate levels, first at the University of Mary Washington, then in the Virginia Community College System’s career-switcher program. He has a B.A. in History (1969) from William and Mary, an M.A. in English from U. Va., and an Ed. D. in English Education, Educational Supervision, and Research. He worked as a seasonal NPS Ranger for several years. Dan has published prizewinning poetry and short fiction, plus one book and several articles in professional education. He has also published three novels.


 

[i]  OpenAI.com, 2023 (Jan. 30). “Mission Statement.”

[ii]   ChatGPT (Mar 23 version), 2023. Chat accessed at: https://chat.openai.com/chat/c65ddd33-191a-4d87-beb6-1e1a9cbeec03

[iii]   Ibid.

[iv]  Weitz, Mark A. 2008. More Damning than Slaughter: Desertion in the Confederate Army. Lincoln, NE: University of Nebraska Press.

[v]  ChatGPT (Mar 23 version), 2023. Op. cit.

[vi]  Ibid.

[vii]  Wilkerson, David, 2023. “Is ChatGPT Making Up References?” Oxford Review Briefings. Accessed at https://oxford-review.com/chatgpt-making-up-references/

[viii]  “In the News: ChatGPT Goes Rogue, Fabricating Citations by Hal Foster and Carolyn Yerkes,” Department of Art & Archaeology, Princeton University. Accessed at: https://artandarchaeology.princeton.edu/whats/news/news-chatgpt-goes-rogue-fabricating-citations-hal-foster-and-carolyn-yerkes

[ix]   ChatGPT (Mar 23 version), 2023. Op. cit.

[x]  Ibid.

[xi] This useful, lengthy article describes ChatGPT's strengths and weaknesses, providing examples and 150 endnotes, which, as all of us know, should be checked for confirmation. If asked, ChatGPT will compare itself to Wikipedia and will give the same warning about source-checking. Wikipedia.org, "ChatGPT" (2023), last updated April 16, 2023. Accessed at: https://en.wikipedia.org/wiki/ChatGPT



6 Responses to Artificial Intelligence (ChatGPT) Tries the Civil War

  1. Very interesting and useful article! It confirms my general thoughts about AI: you'd better know the subject very well going in, and not expect AI to do any serious scholarship for you beyond possibly turning up some potential new sources and making a list. I was intrigued, though, by the initial AI response to the question about Lee's reorganization of his army into three corps: since there was nothing from which to source AI's conclusion that Lee may have been influenced by the Union Army's actions, could it be that a bit of deep learning resulted in an original thought? Or did AI just pick it up as a stray remark somewhere and pile it in, sans source, in an effort to get a better grade? (Thinking emoji).

  2. Thank you very much for this thoughtful, insightful article concerning ChatGPT! Your article matches work that I did with the same results. It appears the artificial intelligence attempts to extrapolate ideas and cite them as facts. Sometimes the sources can also be deeply flawed, even when they are legitimate like the ORs.
    Bravo! Job well done!

    1. That was my thinking—that the “machine learning” element may have an algorithm or two that encourage extrapolation to fill gaps in an apparent line of thought or evidence chain. That may work in electronic music when you’re trying to map sound samples for five or six notes to cover the entire keyboard, but not in research in social sciences.
      DW

  3. Dan, thanks for sharing your dive into ChatGPT. Every little bit helps everyone understand its positives, strengths, shortcomings, and negatives. You have given me some things to think about with my students about responsible use of AI.
