Is Digitizing Historical Texts a Bad Idea?

Several years ago I took a group of Mason students to Prague, Vienna, and Budapest. Among the things I’d planned for them was a visit to the Klementinum in Prague, where the Codex Gigas (the “Devil’s Bible”) was on display. Needless to say, when I told them we were going to a library to look at a book, they were decidedly underwhelmed. Until they saw it up close and personal.

At 90cm x 50cm and weighing in at about 75 kilograms (roughly 165 pounds), it’s quite a book and was unlike anything they had seen or expected. More intriguing to them, though, was the legend surrounding the work. It was created sometime between 1200 and 1230 in a monastery in Bohemia, and the story that goes with the bible is that the devil himself helped a monk create it in just one night. In exchange, the monk included an image of the devil as part of the text decoration. Despite their earlier reluctance to go look at a book, the students pronounced the whole thing kind of cool.

I was reminded of that trip the other night during a tutorial I’m leading with four of our most talented doctoral students. One of those four, Jeri Wieringa, asked one of those questions that students regularly ask us and that make us think really hard. I’ll paraphrase what she asked: “If we digitize texts and present them to students as just so many pixels, are they losing an essential connection to the text as a historical artifact?”

This question led to an energetic discussion around our table. On the one hand, there are obvious advantages to digitizing texts. Most obviously, the texts, especially those from before the age of the typewriter, become much more legible and therefore accessible to a wide audience. Anyone who has taught pre-typewriter texts knows just how reluctant students can be when it comes to trying to make sense of handwriting from back in the day. Even excellent tutorials like the one on decoding Martha Ballard’s diary can reinforce the notion that such handwriting is essentially unreadable except by experts or code breakers.

A second obvious advantage is that the text becomes fully searchable in ways that it can’t be when it is just an image of a document. Our Papers of the War Department project here at RRCHNM is a great example of the advantages of having transcribed texts to sort through and analyze using the text analysis algorithm of your choice.

Finally, making the text available in this way exposes any digitized collection to crawling by the various search engines, thereby opening it up to a much larger audience.

But, and this was the but that we got stuck on in our discussion, the artifact itself can disappear from the view of the researcher if an image of the original is not also available to the researcher. We really liked the War Department project because that image is there for users to see any time they want. [NB: I edited this paragraph because in the original, my wording made it sound as though images weren’t available on the War Department site.]

To put it another way, the coolness of the text as artifact disappears when all the researcher/student sees is black pixels on a white screen. Yes, it’s much more readable and accessible. But there is a bigger potential problem–and this is the one that really troubled Jeri. An essential task of the historian is to assign greater or lesser value to a particular historical source based on his/her growing expertise in a given subject. Some documents are just more important to a given problem or interpretation than others and it’s up to us to help others see that.

But if all documents are reduced to black pixels on a white screen, they start to seem all the same. Given that students/novice historians often have a difficult time placing sources in a hierarchy of importance that they are developing, if all texts look the same, are we making it more difficult for them to develop this skill of prioritizing some sources over others?

We arrived at no answer in our conversation and despite two weeks of ruminating on the issue, I still don’t have one. I’m just going to have to worry about this one for a while longer.

5 thoughts on “Is Digitizing Historical Texts a Bad Idea?”

  1. Sarah

    Great questions, and ones that I worry about, too. Digitized documents do tend to all look the same–they’re the same size (thumbnail), the same depth (flat), the same materiality (pixels). As someone interested in the materiality of texts (and the liveness of theatrical performance), the flat consistency of digitization in its current incarnation bothers me. But I do think that as digitization moves beyond the rhetoric of increasing access to documents, there’s a possibility of more exciting approaches. I’ve been following the work of the Great Parchment Book project with interest, especially in light of the tools it’s been developing for wrestling with reshaping and recovering the book. The key, I think, will be to continue to put historians, literary scholars, bibliographers, and digital technologists in conversation with each other to see what it is we’re not seeing!

  2. professmoravec

    yes, as someone who can still recall vividly the day I “discovered” something in a library and decided not to attend grad school and to instead get a Ph.D. and “discover” stuff for the rest of my life, I do worry about that. I also worry about serendipity, which has been a big part of my research experience. That OHHHH what the hell is this doing here WOW yay didn’t know I wanted this.

  3. Stephen Ramsay

    “Some documents are just more important to a given problem or interpretation than others and it’s up to us to help others see that.”

    I’m surprised, Mills, to see you making what is ultimately a “book-in-the-bathtub” argument about the primacy of the original artifact. And you betray yourself with the sentence above. Does your problem involve dating a text through paleographical analysis? Determining the dissemination of a printer’s work through watermarks? Sourcing the carmine in the ink of a medieval manuscript? Looking for an underpainting beneath a portrait?

    If so, your problem clearly necessitates access to the original work. But if you are concerned merely to read it, it seems to me very hard to argue against a digital copy. And the truth is that even digital copies can rival the originals for problems that apparently involve the “thingness” of the thing. Scans of the Beowulf manuscript — which no responsible scholar should ever touch — are of such density that one can see the hills and valleys of the vellum. I’m unable to imagine what it is about scans of the War Papers that make the original “disappear from view” or resistant to prioritization as historical sources. Are you prepared to argue that Spencerian handwriting moves documents up and down the hierarchy of importance?

    Devils in the margins are, of course, cool. But turning that into an argument for the general supremacy of the original artifact reveals, in most cases, nothing but a thinly veiled nostalgia. All things being equal, I’d much rather hold the Codex Vaticanus (or, for that matter, the Haldeman diaries) in my hand. But that is because, to paraphrase Benjamin, we are addicted to aura, and not because there is no other substitute.

  4. Jeri Wieringa

    In response to the concern about fetishizing the physical object, I wanted to clarify my concerns.

    The conversation came out of our discussion of using existing digital resources for teaching history with primary materials. The framework of the conversation was the theory that using primary material helps in teaching historical thinking by allowing students to interact with the complexity of the historical record. An assumption implicit in this approach is that students will find history more engaging when they are asked to work with the “stuffs” of history.

    My concern came from repeatedly seeing primary materials presented in such a way as to remove all visual cues of their origins or historicity or context. (For example, see and Visually, the “historical” text is the same as the framing text. My first concern is that this seems to be at odds with the pedagogical goals these sites are hoping to promote.

    (As a side note, I don’t think this is a problem with the Papers of the War Department.)

    The larger concern, however, is about the loss of contextual information in our efforts to digitize the past. While I agree that most scholars aren’t engaged in performing analysis on the paper or the materials themselves, the visual cues of the material context are important for historical scholarship. And, as a corollary to this, seeing the surrounding material, whether that be marginalia, additional stories in a newspaper or periodical, or the like, is important in building a narrative about the past. The problem is not the hierarchy of sources but presenting the stuffs of the past in ways that capture more than just the ideas expressed.

