Is Digitizing Historical Texts a Bad Idea (II)?

My previous post about digitized historical texts generated some very interesting comments, both here and on Twitter. I met with my students again last night and we had an extended discussion about those comments, so thanks to everyone who chimed in. What follows is a summary, more or less, of our conversation last night.

We were particularly taken by Steve Ramsey’s critique of my post, especially the following paragraph:

If so, your problem clearly necessitates access to the original work. But if you are concerned merely to read it, it seems to me very hard to argue against a digital copy. And the truth is that even digital copies can rival the originals for problems that apparently involve the “thingness” of the thing. Scans of the Beowulf manuscript — which no responsible scholar should ever touch — are of such density that one can see the hills and valleys of the vellum. I’m unable to imagine what it is about scans of the War Papers that make the original “disappear from view” or resistant to prioritization as historical sources. Are you prepared to argue that Spencerian handwriting moves documents up and down the hierarchy of importance?

None of us was arguing that digitizing texts was, in and of itself, bad. We all agreed that access to the content of those texts was an unqualified good. And I’ve gone back into the original post and clarified my language about the War Department project, because the way I wrote one sentence made it sound as though I was unhappy with the scans of the documents (which are copies of the originals due to a fire that destroyed the originals — see the project page for more on this issue).

Nevertheless, we all agreed that as historians, we care about the “thingness” of the source, and we care a lot about that. Not because of some “thinly veiled nostalgia” for the thing itself, but because texts are both texts and historical artifacts and so students of the past need access to that thingness if they are to understand both aspects of the source: its content and its materiality.

The importance of the text itself is pretty obvious and so doesn’t need clarification. But the materiality does. We discussed, for instance, the problems posed in teaching using historical newspapers via a database like ProQuest Historical Newspapers. The ProQuest search delivers the requested story abstracted from the page on which it appeared. The full page is available as well, but unless students are taught what a newspaper is, and how the arrangement of content on the page and its placement in a section is the result of a dynamic process involving editors, writers, and layout staff, they will have no sense of why the placement of that story sometimes matters as much as the content of the story itself. “Above the fold” or “below the fold” become meaningless when a database serves up only the story.

ProQuest at least returns a PDF of the original story, so students can see the typeface and (often but not always) the images that went along with the story. And they can examine the headline and consider why a headline might be more sensational than the content of the story warrants, again as a result of the dynamic process involving several actors that I just described.

As for the hierarchy we assign to sources, we also agreed that sometimes we might assign a different importance to a source based on things other than the words in the text, and that all sorts of other factors, most of them material, might convince us that this or that source was of greater import. Everything about the source (not just its words, but the marginalia, its placement in a collection, or where it was found) can shed potentially important light on what the source means and meant to others at the time it was created or later.

Given all of that, we wanted some sort of best practices for digitizers that included common standards for such things as providing images of the original to go with the plain text on a white screen. As Sarah Werner wrote in her comment, creating such standards will require historians, bibliographers, archivists, and technologists to get together and discuss, among other things, what they (and our students) aren’t seeing when all we get is black pixels on a white screen.

Is Digitizing Historical Texts a Bad Idea?

Several years ago I took a group of Mason students to Prague, Vienna, and Budapest. Among the things I’d planned for them was a visit to the Klementinum in Prague where the Codex Gigas (the “Devil’s Bible”) was on display. Needless to say, when I told them we were going to a library to look at a book, they were decidedly underwhelmed. Until they saw it up close and personal.

At 90cm x 50cm and weighing in at about 75 kilograms, it’s quite a book and was unlike anything they had seen or expected. More intriguing to them, though, was the legend surrounding the work. It was created sometime between 1200 and 1230 in a monastery in Bohemia, and the story that goes with the bible is that the devil himself helped a monk create it in just one night. In exchange, the monk included an image of the devil as part of the text decoration. Despite their earlier reluctance to go look at a book, the students pronounced the whole thing kind of cool.

I was reminded of that trip the other night during a tutorial I’m leading with four of our most talented doctoral students. One of those four, Jeri Wieringa, asked one of those questions that students regularly ask us and that make us think really hard. I’ll paraphrase what she asked: “If we digitize texts and present them to students as just so many pixels, are they losing an essential connection to the text as a historical artifact?”

This question led to an energetic discussion around our table. On the one hand, there are obvious advantages to digitizing texts. Most obviously, the texts, especially those from before the age of the typewriter, become much more legible and therefore accessible to a wide audience. Anyone who has taught pre-typewriter texts knows just how reluctant students can be when it comes to trying to make sense of handwriting from back in the day. Even excellent tutorials like the one on decoding Martha Ballard’s diary can reinforce the notion that such handwriting is essentially unreadable except by experts or code breakers.

A second obvious advantage is that the text becomes fully searchable in ways that it can’t be when it is just an image of a document. Our Papers of the War Department project here at RRCHNM is a great example of the advantages of having transcribed texts to sort through and analyze using the text analysis algorithm of your choice.

Finally, making the text available in this way opens up any digitized collection to crawling by the various search engines, thereby opening up the collection to a much larger audience.

But, and this was the but that we got stuck on in our discussion, the artifact itself can disappear from the view of the researcher if an image of the original is not also available to the researcher. We really liked the War Department project because that image is there for users to see any time they want. [NB: I edited this paragraph because in the original, my wording made it sound as though images weren't available on the War Department site.]

To put it another way, the coolness of the text as artifact disappears when all the researcher/student sees is black pixels on a white screen. Yes, it’s much more readable and accessible. But there is a bigger potential problem–and this is the one that really troubled Jeri. An essential task of the historian is to assign greater or lesser value to a particular historical source based on his/her growing expertise in a given subject. Some documents are just more important to a given problem or interpretation than others and it’s up to us to help others see that.

But if all documents are reduced to black pixels on a white screen, they start to seem all the same. Given that students/novice historians often have a difficult time placing sources in a hierarchy of importance that they are developing, if all texts look the same, are we making it more difficult for them to develop this skill of prioritizing some sources over others?

We arrived at no answer in our conversation and despite two weeks of ruminating on the issue, I still don’t have one. I’m just going to have to worry about this one for a while longer.

No More Lying About the Past

Regular readers of this blog know that in 2008 I created a course called “Lying About the Past” in which my students studied how, over the past several centuries, a variety of people have created false versions of the past, for fun or profit. The goal of the course was to teach my students much greater skepticism about historical sources, especially online historical sources, and I feel very confident in saying that the course, which I taught a second time in 2012, achieved that goal with flying colors.

What made this course controversial, to a small degree in 2008 and to a much wider degree in 2012, was that in each iteration of the course the students created a historical hoax and turned it loose online for ten days to see if they could fool anyone. Because we were not in the business of creating what a colleague calls “zombie facts,” the students exposed their hoaxes after the allotted ten days and then assessed what had and hadn’t worked in their project and why.

Those who disagreed with the notion that my students should turn their (very innocuous) hoaxes loose for a few days felt that I was teaching my students to behave in very unethical ways, that we were somehow polluting the web, or that we had violated something one critic called the implied “academic trust network” that exists online. Of course, my students and I completely understood these criticisms–they were all issues we discussed in great detail in the course. You had to be there each semester to see the care my students took thinking through these and other ethical issues to understand just how central ethical discussions were to the entire course. In fact, I think it’s fair to say that my students spent more time discussing the ethics of the historical profession in this course than in any other history course they have taken or will take.

In 2012 I proposed to my department that Lying About the Past be made a part of the regular curriculum of the department, by which I mean the course would receive its own number and be added to the university catalog as one optional course among dozens that we offer. The undergraduate committee in my department decided that the proposal could go forward only if I agreed to change the central component of the course by making the hoaxes purely classroom presentations rather than turning them loose online. Because the fact that the hoaxes would be placed in front of an unknown audience is the thing that gave the course its energy, its excitement, and made it fun, changing the format in this way would have turned the class project into yet another abstract classroom-only exercise and would have sucked the life out of the course. I therefore declined to make the change and the undergraduate committee subsequently rejected my proposal.

What this means is that I won’t be teaching Lying About the Past any longer at George Mason, which I’m sure will make my critics happy, especially Jimmy Wales, who pronounced himself “annoyed” about the whole thing.

But I also think it’s worth considering what the decision of the undergraduate committee means in terms of how we regulate teaching as opposed to research. In essence, my colleagues (whom, by the way, I respect very much) decided that it was acceptable to tell a faculty member that he could not teach a course because they disagreed with the teaching methodology. Can you imagine the furor that would ensue if the word “research” were substituted for “teaching” in the previous sentence?

I asked several of my colleagues who had been at Mason for more than 20 years if they could remember a time when a professor had been denied the right to teach a course as he/she saw fit and none could. It’s an interesting and potentially disturbing precedent my colleagues have set, because it says that teaching methods can be regulated in ways we would never allow when it comes to our research.

I have another course up my sleeve that will be almost, but not quite, as disruptive to our notions of how history can be taught as Lying About the Past was, and my department chair has signed off on it. As soon as it is in the schedule of classes, I’ll be sure to post an advance notice here.

[NB: I'm posting this on March 31, not April 1 so that it's clear the entire above message is not a hoax. Trust me, it's not.]

[NB #2: For a recent interview with me about the course, see Aleks Krotoski's piece on DML Central. For how the work of this class fits into a wider framework of mischief making, listen to Aleks's "Digital Human" show on BBC Radio 4 from April 1, 2013.]

[NB #3: The Chronicle of Higher Education did a follow-up story on this post. Read it here if you can get past their paywall.]

[NB #4: This post was republished by the London School of Economics' "Impact of Social Sciences" blog on April 10, 2013.]

Can You Tell a Book By Its Cover?

It won’t be long (one month, actually) before Teaching History in the Digital Age is available. But the cover has now appeared on the Michigan Press website and I’m very pleased with the result.

[Image: cover of Teaching History in the Digital Age (9780472118786)]
