UNIVERSITY OF ILLINOIS AT URBANA-CHAMPAIGN Clumps and Runners John Unsworth Bamboo Workshop, Tucson AZ, January 13, 2009.



Exchange of ideas: “A substantial amount of the work of the humanities is carried out in the form of the exchange of ideas.” -- Anthony Cascardi. “No ideas but in things” -- William Carlos Williams

Things in the humanities: the whole variety of material cultural heritage, from images to sound to text to maps to architecture, landscape, and all manner of carrier media... including sheep DNA in parchment.

Representing things: a problem usually discussed at a high level of abstraction in digital humanities. I'd like to discuss it at a very prosaic level instead, with text objects as the example.

TEI: "The Text Encoding Initiative Consortium is an international organization whose mission is to develop and maintain guidelines for the digital encoding of literary and linguistic texts. The Consortium publishes the Text Encoding Initiative Guidelines for Electronic Text Encoding and Interchange: an international and interdisciplinary standard that is widely used by libraries, museums, publishers, and individual scholars to represent all kinds of textual material for online research and teaching." http://www.tei-c.org/About/index.xml

Lessons of Google? But why do any mark-up at all? Isn't the lesson of Google that all that really matters is the word, and tagging is superfluous? Not if you want to select out of Google Books: fiction; books about England; books published in the 1900s; books written by women.

But structural markup? Really? It turns out that, more often than not, paragraphs and chapters, lines and verses, are meaningful units of composition, and therefore they can be meaningful units of analysis. It can also be important to recognize the difference between the main text of a work (the chapters in a novel, say) and the paratext (table of contents, preface, running headers, etc.), especially if you're asking statistical questions about the text.
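A minimal sketch of why the main-text/paratext distinction matters for statistical questions. The sample document, its content, and the helper `count_words` are invented for illustration; the element names follow TEI's `front`/`body`/`div`/`p` structure, but namespaces and the TEI header are left out to keep the example short.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified TEI-style sample (no namespace, invented content).
SAMPLE = """<text>
  <front>
    <div type="contents"><p>Chapter One Chapter Two</p></div>
  </front>
  <body>
    <div type="chapter"><p>It was a dark night.</p><p>Rain fell.</p></div>
    <div type="chapter"><p>Morning came at last.</p></div>
  </body>
</text>"""


def count_words(element):
    """Count whitespace-separated tokens in all text under an element."""
    return len(" ".join(element.itertext()).split())


root = ET.fromstring(SAMPLE)
body = root.find("body")          # main text only
chapters = body.findall("div")    # structural units inside the main text

total_words = count_words(root)   # paratext included
body_words = count_words(body)    # paratext excluded
chapter_words = [count_words(ch) for ch in chapters]

print(total_words, body_words, chapter_words)
```

Without the structural tags, the table of contents would be counted as if it were part of the novel, and there would be no chapter boundary to compute per-chapter figures against.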

Surely we don't need to mess with the words, though... At the word level, it can be helpful (in a novel, for example) to ignore proper names, so as to see more clearly what's going on with other kinds of words. But even at the word level, the information about the words is carried, ultimately, in tagging.
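A sketch of the proper-name case, assuming the names have already been tagged with a `name` element. The sample sentence and the helper `words_outside` are hypothetical; the point is only that the filter depends entirely on the tagging being there.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical sample: proper names are wrapped in <name> (invented content).
SAMPLE = """<p>
  <name>Emma</name> walked to <name>Highbury</name> and the rain fell on the road to <name>Highbury</name>
</p>"""


def words_outside(element, skip_tag):
    """Collect lowercased words, skipping anything inside skip_tag elements."""
    if element.tag == skip_tag:
        return []
    words = []
    if element.text:
        words.extend(element.text.lower().split())
    for child in element:
        words.extend(words_outside(child, skip_tag))
        # Text that follows a child belongs to the parent, so always keep it.
        if child.tail:
            words.extend(child.tail.lower().split())
    return words


root = ET.fromstring(SAMPLE)
all_words = " ".join(root.itertext()).lower().split()
other_words = words_outside(root, "name")

with_names = Counter(all_words)
without_names = Counter(other_words)
print(with_names.most_common(3))
print(without_names.most_common(3))
```

In untagged text the same filter would need a name list or a trained recognizer, and the result of that step would itself have to be recorded somewhere, which in practice means tagging.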

Interoperability: The TEI community believes... that "people use TEI in many different contexts for many different purposes to encode many different kinds of material." But they also believe that this somehow, in some universe, achieves the TEI's stated goal of interoperability. It really doesn't. So if people are in fact encoding things in all sorts of different ways and for different purposes, then why shouldn't I chuck it all and roll my own? You say that it's better not to go it alone. (Steve Ramsay, in email)

Interoperability? As we are coming to the end of this project -- and returning to an earlier exchange of views on the Monk list about interoperable texts -- I can't refrain from pointing to the large amount of needless and heedless divergence. There is good and bad news about it. The bad news is that it has caused a lot of work. The good news is that a very high percentage of problems can be solved quite satisfactorily by supplementary conventions to the content rules of elements. If, for instance, the people who slapped the Level 4 Guidelines together had spent two hours about making recommendations what to do or not to do about soft hyphens at the end of a line or page when you encode a text according to Level 4 Guidelines, we'd have a lot less grief. And so it goes with a lot of other little stuff. (Martin Mueller, in email)
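To make the soft-hyphen complaint concrete, here is a sketch of three ways one line-broken word might arrive from three different collections, and a normalizer that reconciles them. The three variants and the function `join_line_breaks` are assumptions made up for this example, not conventions taken from any actual guideline; `lb` is used as the line-break marker.

```python
import re

SOFT_HYPHEN = "\u00ad"

# Hypothetical encodings of the same phrase, broken across a line at "heed|less".
VARIANTS = [
    "needless and heed-<lb/>less divergence",                  # hard hyphen kept
    "needless and heed" + SOFT_HYPHEN + "<lb/>less divergence",  # soft hyphen kept
    "needless and<lb/>heedless divergence",                    # word left whole
]

# A hyphen (hard or soft) directly before a line break: rejoin the word.
# Assumption: every end-of-line hyphen is a soft one. A real compound such as
# "well-<lb/>known" would be joined wrongly, which is why a shared convention
# is needed in the first place.
HYPHEN_BREAK = re.compile("[-" + SOFT_HYPHEN + r"]\s*<lb/>\s*")
PLAIN_BREAK = re.compile(r"\s*<lb/>\s*")


def join_line_breaks(encoded):
    """Return the running text with line breaks and line-end hyphens removed."""
    joined = HYPHEN_BREAK.sub("", encoded)
    joined = PLAIN_BREAK.sub(" ", joined)
    return joined.replace(SOFT_HYPHEN, "")


normalized = [join_line_breaks(v) for v in VARIANTS]
print(normalized)
```

Each collection is internally consistent; the work appears only when texts from all three have to be searched or counted together.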

Clumps and Runners: So, we have data in clumps, in collections that are curated and hosted by libraries, publishers, and others; what we need are the runners that connect those clumps, and what we'll discover when we have them is that data doesn't move between clumps very successfully. That's a problem that nobody is really dealing with, and that's a role for Bamboo. I'm not recommending that Bamboo become a standards group, but rather that Bamboo attend to the actual problems of data interchange and interoperability, in actual data domains, for actual research projects, and that it collect and synthesize the experience of actual practice and turn it back to the stewards of content, to reduce “needless and heedless divergence.”

