ETD2004, June 3-5 2004 University of Kentucky, Lexington Structured ETDs at the Document and Publication Server of Humboldt University From DTD generation.

Presentation transcript:

ETD2004, June 3-5, 2004, University of Kentucky, Lexington
From DTD generation to XML conversion: Structured ETDs at the Document and Publication Server of Humboldt University
Uwe Müller, Electronic Publishing Group, Humboldt University, Berlin

ETD2004, June 3-5, 2004, University of Kentucky, Lexington. From DTD generation to XML conversion: Structured ETDs at Humboldt's EDoc Server. Uwe Müller, Electronic Publishing Group, CMS / UB, Humboldt University, Berlin

Background
Humboldt University: 800 – dissertations / year
Germany: duty to publish dissertations
– traditional methods: publishing house, microfiche, 40 … 200 printed copies (depending on faculty regulations)
Humboldt University: submitting an ETD is not mandatory
about ¼ of dissertations are published electronically
XML as central strategy

Why XML?
standardized format
long-term preservation
easily convertible to
– presentation formats (HTML, PDF)
– other XML structures
qualified full-text retrieval
contains structural and contextual information in a machine-readable format
[Diagram labels: XML; HTML, PDF, and Office document, each with a digital signature]

XML: Restrictions to deal with
XML source does not contain layout information
rather linear structure
XML is not used as an authoring system
– authors use their 'own' systems: Microsoft Word, LaTeX, Open Office / Star Office, Framemaker, Word Perfect

How to bridge the gap?
meet the authors where they are …
instructions and guidelines for authors
– usage of style files (e.g., dissertation-hu.dot)
– manuals, support hotline, regular courses
different conversion processes
– SGML author (plug-in for MS Word <= 97)
– Open Office / Star Office: exploit the genuine XML format
– MS Office 2003 → XML according to DiML DTD
– common pitfalls: tables, pictures

Conversion Process Using Open Office
[Diagram; file labels in the order they appear on the slide, arrows lost in extraction:]
example.doc
Open Office: example.sxw (zip file)
content.xml
example_stl.xml
example.xml
front.xml, chapter1.xml, chapter2.xml, chapter3.xml
example.html, *.gif, *.jpg
front.html, chapter1.html, chapter2.html, chapter3.html
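The slide labels example.sxw as a zip file and lists content.xml directly after it, which suggests that the first conversion step is pulling that member out of the archive. A minimal Python sketch of that step follows; it is an illustration under assumptions, not the converter used at Humboldt University. The helper names and the tiny demo archive (with an invented `doc`/`p` vocabulary rather than the real OpenOffice schema) are hypothetical.

```python
import io
import zipfile
import xml.etree.ElementTree as ET


def read_content_xml(sxw_bytes: bytes) -> ET.Element:
    """Return the parsed root element of content.xml from an .sxw archive."""
    with zipfile.ZipFile(io.BytesIO(sxw_bytes)) as archive:
        with archive.open("content.xml") as handle:
            return ET.parse(handle).getroot()


def make_demo_sxw() -> bytes:
    """Build a tiny stand-in archive (hypothetical content, not the real schema)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr(
            "content.xml",
            "<doc><p style='Heading 1'>Introduction</p></doc>",
        )
    return buffer.getvalue()


root = read_content_xml(make_demo_sxw())
first_paragraph = root.find("p")
print(first_paragraph.get("style"), first_paragraph.text)
```

With a real file, the bytes would come from reading example.sxw from disk instead of `make_demo_sxw()`.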

Principal Structure of a DiML document
[XML listing; tags lost in extraction. Surviving element labels: title, author, abstract, bibliography, appendix, vita]
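Because the listing on this slide did not survive, the following Python sketch only illustrates what such a document skeleton might look like. The element names title, author, abstract, bibliography, appendix, and vita come from the slide, and front, body, and back are named elsewhere in the talk as part of the diml module. The root element name `etd` and the assignment of children to front and back are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Assumed grouping: the slides name these elements, but which child
# belongs to which part is a guess, not taken from the DiML DTD.
LAYOUT = {
    "front": ["title", "author", "abstract"],
    "body": ["chapter"],
    "back": ["bibliography", "appendix", "vita"],
}


def build_skeleton(root_name: str = "etd") -> ET.Element:
    """Create an empty document tree with front, body, and back parts."""
    root = ET.Element(root_name)  # root element name is hypothetical
    for part, children in LAYOUT.items():
        part_element = ET.SubElement(root, part)
        for child in children:
            ET.SubElement(part_element, child)
    return root


skeleton = build_skeleton()
print(ET.tostring(skeleton, encoding="unicode"))
```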

From flat structure to Hierarchy
only two types of styles in Word
– paragraph styles
– character styles
e.g., at the first occurrence of the Heading 1 paragraph style, the converter has to know that
– Heading 1 is the beginning of a chapter
– Heading 1 implies a head element
– the element chapter can only occur in body
[Example heading text on the slide: Introduction]
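The three rules on this slide can be sketched in a few lines of Python. This is a simplified illustration, not the actual conversion code: it assumes the input has already been reduced to (paragraph style, text) pairs, and the style name `Standard` for ordinary paragraphs is a hypothetical placeholder.

```python
import xml.etree.ElementTree as ET


def paragraphs_to_tree(paragraphs):
    """Turn a flat list of (style, text) pairs into body/chapter/head/p."""
    body = ET.Element("body")  # chapters may only occur inside body
    chapter = None
    for style, text in paragraphs:
        if style == "Heading 1":
            # Heading 1 starts a new chapter and implies a head element.
            chapter = ET.SubElement(body, "chapter")
            ET.SubElement(chapter, "head").text = text
        else:
            if chapter is None:
                raise ValueError("paragraph found before the first Heading 1")
            ET.SubElement(chapter, "p").text = text
    return body


flat = [
    ("Heading 1", "Introduction"),
    ("Standard", "First paragraph."),
    ("Heading 1", "Methods"),
    ("Standard", "Second paragraph."),
]
body = paragraphs_to_tree(flat)
print(ET.tostring(body, encoding="unicode"))
```

A real converter would also have to handle deeper heading levels (sections, subsections) and character styles, which this sketch leaves out.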

One Core – Multiple Views
HTML generation (static or dynamic)
– performance problems with XSLT and huge documents
– solution: division of XML sources into components (easier and faster to process)
PDF + Print on Demand
Current problems
– changing Office systems and versions
– ongoing implementations and adaptations necessary
– but: might be restricted to XSL coding
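The idea of dividing the XML source into components can be sketched as follows. This Python example is a hedged illustration: the component names front.xml and chapterN.xml follow the file names shown on the conversion slide, while the demo document, its root element `etd`, and the function name are assumptions.

```python
import xml.etree.ElementTree as ET

# Hypothetical miniature document; a real thesis would be far larger.
DEMO = (
    "<etd><front><title>Example</title></front>"
    "<body><chapter><head>Introduction</head></chapter>"
    "<chapter><head>Methods</head></chapter></body></etd>"
)


def split_into_components(root):
    """Map component file names to the serialized XML of each part."""
    components = {}
    front = root.find("front")
    if front is not None:
        components["front.xml"] = ET.tostring(front, encoding="unicode")
    chapters = root.findall("body/chapter")
    for number, chapter in enumerate(chapters, start=1):
        name = f"chapter{number}.xml"
        components[name] = ET.tostring(chapter, encoding="unicode")
    return components


components = split_into_components(ET.fromstring(DEMO))
for name in sorted(components):
    print(name, len(components[name]))
```

Each component can then be transformed on its own, so a stylesheet never has to load the whole document at once.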

Towards a universal DTD?
DiML: originally taken from an SGML DTD at Virginia Tech ("ETD")
– already many elements (> 100)
– combines elements of different description levels
– extended and adapted to local needs
special requirements from several departments (e.g., literature / dramatics, humanities, geography, …)
necessity to include external DTDs (e.g., CALS-Table, MathML, MusicML, …)
publication types other than theses and dissertations
– conference proceedings, electronic journals, other series, …
first approach: extend the DTD, aiming at a universal 'mega' DTD
– problems: complexity, difficult maintenance
other possibility: create a completely new DTD for each purpose
– loss of interoperability

Modular DTD Approach
idea: individually adapted DTDs
1. split up the DTD into modules, such as
– text, structure, citation, dramatics
2. handle external DTDs as modules as well, e.g.,
– MathML, MusicML, CALS-Table
3. recombine a DTD out of user-selected modules
result
a. a DTD with only the needed elements and modules
b. individual reference and sample documents
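Step 3, recombining a DTD from user-selected modules, can be illustrated with a small Python sketch. It is not DTDSys itself, which according to the talk uses XSL/XSLT and Java. The module names echo the slides, but the in-memory module store and every content model below are invented for illustration.

```python
# Hypothetical module store: the declarations are made up and do not
# reproduce the real DiML modules.
MODULES = {
    "text": [
        "<!ELEMENT em (#PCDATA)>",
        "<!ELEMENT strong (#PCDATA)>",
    ],
    "structure": [
        "<!ELEMENT chapter (head, p*)>",
        "<!ELEMENT head (#PCDATA)>",
        "<!ELEMENT p (#PCDATA | em | strong)*>",
    ],
    "citation": [
        "<!ELEMENT quote (#PCDATA)>",
    ],
}


def compose_dtd(selected):
    """Concatenate the declarations of the selected modules into one DTD."""
    unknown = [name for name in selected if name not in MODULES]
    if unknown:
        raise KeyError(f"unknown modules: {unknown}")
    lines = []
    for name in selected:
        lines.append(f"<!-- module: {name} -->")
        lines.extend(MODULES[name])
    return "\n".join(lines) + "\n"


dtd = compose_dtd(["text", "structure"])
print(dtd)
```

Only the declarations of the chosen modules end up in the output, which mirrors result (a) on the slide.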

Modular DTD Approach: Benefits
modules are easily maintainable
– distributed development
– version numbers for each module
reusability
– define (several) styles for each module
– reference information for each module
support for different languages
get a DTD that exactly fits your needs

DTDSys: Principal Architecture
modules
– small packages of related elements
– stored in separate files in the DTDBase
– include metadata, e.g., descriptive information, version numbers, and dependencies on other modules
DTDSys generates DTD and reference files using
– XSL / XSLT
– Java
– Web Interfaces

Modules and Dependencies
text: br, em, strong, sup, sub, u, tt, pre
common: p, head, caption, url, name, foreign…
structure: chapter, section, subsection…
citation: quotations and references
documents: page numbers, footnotes, endnotes, …
diml: front, body, back, abstract…
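Since modules depend on one another, a generator has to pull in every required module before the ones that build on it. The Python sketch below shows one way to resolve such an order. The module names come from the slide, but the dependency edges are assumptions: the slide's diagram did not survive, so the graph here is purely illustrative.

```python
# Assumed dependency graph; the real edges are not recoverable from the slide.
DEPENDS_ON = {
    "text": [],
    "common": ["text"],
    "structure": ["common"],
    "citation": ["common"],
    "documents": ["common"],
    "diml": ["structure", "citation", "documents"],
}


def resolve(selected):
    """Return the selected modules plus dependencies, prerequisites first."""
    ordered = []

    def visit(name, trail):
        if name in trail:
            raise ValueError(f"dependency cycle through {name}")
        if name in ordered:
            return
        for dependency in DEPENDS_ON[name]:
            visit(dependency, trail + (name,))
        ordered.append(name)

    for name in selected:
        visit(name, ())
    return ordered


structure_order = resolve(["structure"])
full_order = resolve(["diml"])
print(structure_order)
print(full_order)
```

Under the assumed graph, selecting only structure also brings in text and common, while selecting diml brings in all six modules.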

DTD Generation Process
[Diagram; labels recovered from the slide, arrows lost in extraction:]
DTDBase: module-text.xml
dependences.html
selection.xml
full-dtd.xml
xdiml.dtd
dtd-reference.xml
reference: p.php, chapter.php (including element info, description, dependences)
processing steps: XSL, Java+XSL, XSL

Outlook
SCOPE = Service Core for Open Publishing Environments
– development of Publication Components (authoring tools, conversion mechanisms, layout and style definitions)
– management system to maintain versions and dependencies
– publication system
– workflow component
Long Term Preservation activities
– implementation of the OAIS reference model
– Sun Center of Excellence

Thanks to Sabine Henneberger, Jakob Voß, Matthias Schulz
Thank you! Questions?