What can you do with texts that are in a digital format? « Digital Scholarship in the Humanities

Digital Scholarship in the Humanities:
I've had a longstanding, friendly debate with a colleague about whether it is sufficient to provide page images of books, or whether text should be converted to a machine- and human-readable format such as XML. She argues that converting scanned books to text is expensive and that the primary goal should be to provide access to more material. True, but converting books into a textual format makes them much more accessible, allowing users to search, manipulate, organize, and analyze them. Here's my summary of what you can do with an electronic text. Most of these advantages are pretty obvious, but worth articulating.
It's not digital text if it's an image file. It's just an image that might contain anything at all. Vannevar Bush's Memex was an idea for a text storage-and-retrieval system that worked by storing and linking microfilm images of pages of text, but his vision was purely analog. Page images do provide a certain amount of information, and today it's not too hard to find tools that convert page images to text, but an archival project is incomplete if the digitization process stops at simply supplying images of the material to be archived.

