1
HarperCollins
2
Agenda
- Content Creation Process
- What is DITA?
- What is DITA Open Toolkit?
- What does RSuite do?
- Demo
  - Manuscript to ICML: Word -> DITA -> ICML
  - Workflow Engine
  - InDesign
- Code: Java, XSLT, XQuery, Java APIs
- “Groundbreaking” Topic
3
Current Book Composition Process
- Step 1: Editorial, Manuscript (docx)
- Step 2: Composition, InDesign (indd)
4
New Book Composition Process
- Step 1: Editorial, Manuscript (docx)
- Transform 1
- Step 2: Generate DITA XML (xml)
- Transform 2
- Step 3: Generate ICML (icml)
- Download ICML
- Step 4: Composition, InDesign (indd)
5
What is DITA?
Darwin Information Typing Architecture
- Is an XML data model for authoring and publishing
- Topic oriented: each topic is a separate XML file
- DocBook is book oriented, more complex, one big XML file
- DITA initial spec in 2001; DocBook initial spec in 1991
- Core DITA topic types: Concept, Task, Reference
- Specialization: subtyping, where new topics are derived from existing ones
6
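What is DITA?
- A topic must have at least an id attribute on the root element and a title. A body is expected in practice, although the DITA content model makes it optional.
- A DITA map stitches topics together.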
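As a minimal sketch of the topic and map structure described on this slide, the Python below parses a tiny topic and a tiny map with the standard library. The ids, file names, and the helper names `missing_parts` and `topic_hrefs` are made up for illustration; the check simply looks for the three parts the slide names and is not a substitute for validating against the DITA DTDs or schemas.

```python
import xml.etree.ElementTree as ET

# Hypothetical minimal topic: id on the root element, a title, and a body.
MINIMAL_TOPIC = """<topic id="sample-topic">
  <title>Sample Topic</title>
  <body>
    <p>One paragraph of content.</p>
  </body>
</topic>"""

# Hypothetical map that stitches two topic files together in reading order.
SAMPLE_MAP = """<map>
  <title>Sample Book</title>
  <topicref href="chapter-1.dita"/>
  <topicref href="chapter-2.dita"/>
</map>"""


def missing_parts(topic_xml):
    """Return which of the parts named on the slide are absent."""
    root = ET.fromstring(topic_xml)
    missing = []
    if not root.get("id"):
        missing.append("id")
    if root.find("title") is None:
        missing.append("title")
    if root.find("body") is None:
        missing.append("body")
    return missing


def topic_hrefs(map_xml):
    """List the topic files a map refers to, in document order."""
    root = ET.fromstring(map_xml)
    return [ref.get("href") for ref in root.iter("topicref")]


print(missing_parts(MINIMAL_TOPIC))
print(missing_parts("<topic><title>No id</title></topic>"))
print(topic_hrefs(SAMPLE_MAP))
```

The map carries only references, which is what lets the same topic file be reused in more than one publication.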
7
What is DITA?
Eliot Kimber
Norm Walsh post from October 2005:
Four key technical differences where DITA may be “better” than DocBook:
1. A topic-oriented authoring paradigm.
2. A cross-referencing scheme that's more practical than XML's flat ID space.
3. SGML's conref, reinvented.
4. An extensibility model based on "specialization".
Demo DITA using Oxygen
8
What is DITA Open Toolkit?
- Open-source publishing system for DITA
- Provides multi-channel output
- Uses a pipeline processing approach using:
  - Java
  - XSLT
  - Rendering engine (FOP, RTF, etc.)
- DITA 4 Publishers
- Demo DITA OT using command line and Oxygen
9
What does RSuite do?
- Centralized repository for “all” artifacts
- Provides:
  - Workflow
  - DITA transforms: Manuscript to DITA, DITA to ICML
  - Multi-channel output: PDF, ePub3, InDesign
  - Role-based security
  - Distribution: FTP to commercial printer, e-commerce sites
11
Diagram: SAN drives, RSuite Tomcat server, and MarkLogic nodes
- SAN Drives: 500 GB, 100 GB / disk (Non XML Disks 1 to 5)
- RSuite Tomcat Server: MySQL disk; temp directories for (1) XSLT transforms and (2) file uploads
- MarkLogic Node 1: 4 CPU, 2 cores / CPU; Disk 1 - Forest 1 (600 GB); Disk 2 - Forest 2; Disk 3 - Backup (300 GB)
- MarkLogic Node 2: 4 CPU, 2 cores / CPU; Disk 1 - Forest 3 (600 GB); Disk 2 - Forest 4; Disk 3 - Backup (300 GB)
- MarkLogic Node 3: 4 CPU, 2 cores / CPU; Disk 1 - Forest 5 (600 GB); Disk 2 - Forest 6; Disk 3 - Backup (300 GB)
12
Diagram: same layout as slide 11 (SAN drives, RSuite Tomcat server, MarkLogic Nodes 1 to 3), with numbered callouts 1, 2, and 3
Feature request: use an XA transaction covering:
- File copy
- MySQL update
- Metadata update
13
RSuite Demo?
- Upload
- Transforms: PDF, ePub
- ICML to InDesign
- MarkLogic config
14
Code?
- Java
- jBPM: business process management framework
- Ivy: to manage plugin dependencies
- Ember.js
- XQuery
- Groovy
- DITA-OT XSLT plugins
- RSuite API docs
15
Groundbreaking Opportunity
Unleash the Tombstones!
- All content can be reused for product development
16
DITA to RDF Transform!
Semantically Linked DITA
- Link to internal and external content: DBpedia, NY Times, Dublin Core, US Census
- Semantic links create a network of knowledge
- Enables inferencing (ML8)
- Uses MarkLogic Triple Index
17
Why RDF?
- RDF complements DITA
- Contains facts about DITA topics
- Facts are stored in the Triple Index
- Facts are used to:
  - Link internal and external documents
  - Derive other facts (inferencing)
  - Provide higher-quality search results
- RDF is an efficient storage and linking mechanism
- MarkLogic turns RDF into triples
18
Why Triples?
- A triple is a Subject-Predicate-Object (SPO) structure used to represent a fact.
- Lets computers derive facts from other facts without human involvement.
Example:
- Ted lives in Chicago, Illinois
- Ted lives near Wrigley Field
- Ted has a roommate called Sam
- Ted and Sam go to Wrigley Field to watch games
From these facts:
- Sam lives in Chicago
- Wrigley Field is in Chicago, Illinois
- Chicago is in Illinois
- Sam and Ted both live in the US
- Etc.
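The derivation in the Ted and Sam example can be sketched as a small forward-chaining loop over triples. This is an illustration only: the predicate names (`livesIn`, `hasRoommate`, `locatedIn`), the two rules, and the background fact that Illinois is in the US are assumptions made for the sketch, and the code says nothing about how MarkLogic's own inferencing is implemented.

```python
# Facts as (subject, predicate, object) triples. Predicate names are
# invented for this sketch and do not come from a real vocabulary.
facts = {
    ("Ted", "livesIn", "Chicago"),
    ("Ted", "hasRoommate", "Sam"),
    ("Chicago", "locatedIn", "Illinois"),
    ("Illinois", "locatedIn", "US"),  # assumed background fact
}


def infer(triples):
    """Apply two hand-written rules repeatedly until nothing new appears."""
    known = set(triples)
    while True:
        new = set()
        for (s, p, o) in known:
            # Rule 1: a roommate lives wherever the other roommate lives.
            if p == "hasRoommate":
                for (s2, p2, o2) in known:
                    if s2 == s and p2 == "livesIn":
                        new.add((o, "livesIn", o2))
            # Rule 2: living in a place implies living in what contains it.
            if p == "livesIn":
                for (s2, p2, o2) in known:
                    if s2 == o and p2 == "locatedIn":
                        new.add((s, "livesIn", o2))
        if new <= known:
            return known
        known |= new


derived = infer(facts) - facts
for triple in sorted(derived):
    print(triple)
```

None of the printed triples were entered by hand; each follows from the four starting facts and the two rules, which is the point the slide makes about deriving facts without human involvement.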
19
How to add Triples?
- Facts need to be curated.
- Data provenance
- Editors can add facts to DITA topic docs.
- New world of Semantic Publishing
- Eroni Kumana
20
Profiles in Courage Example
Add facts to Chapter 4 DITA XML:
- “Profiles in Courage” Primary ISBN value is
- John F. Kennedy is the Author Of “Profiles in Courage”
- John F. Kennedy is a Person
- John F. Kennedy was at the Solomon Islands in August 1943
- Eroni Kumana is a Person
- Eroni Kumana was at the Solomon Islands in August 1943
- Eroni Kumana rescued John F. Kennedy
- Eroni Kumana is mentioned in Chapter 4, Profiles in Courage
Semantic event (NY Times news feed):
- Eroni Kumana died on August 2, 2014
Event triggers automatic pub:
- The CMS automatically publishes a “Profiles in Courage” web page with a snippet from the specific chapter that references Eroni Kumana.
- The new web page also has a link to like and/or purchase the book.
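One way to picture the event-triggered publication is a lookup against the curated facts when a news event arrives. The sketch below is hypothetical throughout: the predicate names, the event tuple shape, and the function `pages_to_publish` are invented for illustration and do not describe an RSuite, MarkLogic, or NY Times API.

```python
# A few of the curated facts from the slide, as (subject, predicate, object).
# Predicate names are made up for this sketch.
facts = [
    ("John F. Kennedy", "isAuthorOf", "Profiles in Courage"),
    ("John F. Kennedy", "isA", "Person"),
    ("Eroni Kumana", "isA", "Person"),
    ("Eroni Kumana", "rescued", "John F. Kennedy"),
    ("Eroni Kumana", "isMentionedIn", "Profiles in Courage, Chapter 4"),
]


def pages_to_publish(event, triples):
    """Return the content that mentions the subject of a 'diedOn' event."""
    subject, predicate, _date = event
    if predicate != "diedOn":
        return []  # this sketch reacts to one kind of event only
    return [o for (s, p, o) in triples
            if s == subject and p == "isMentionedIn"]


# Hypothetical event as it might arrive from a news feed.
event = ("Eroni Kumana", "diedOn", "2014-08-02")
print(pages_to_publish(event, facts))
```

The lookup only works because an editor recorded the "is mentioned in" fact ahead of time, which is why the deck stresses curation.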
21
Generate Transient RDF
Book Process Steps
- Step 1: Editorial, Manuscript (docx)
- Transform 1: Word 2 DITA
- Step 2: Generate DITA XML (xml)
- Transform 2: DITA 2 ICML
- Step 3: Generate ICML (icml)
- Download ICML
- Step 4: Composition, InDesign (indd)
- Transform 3: DITA 2 RDF
- Step 3: Generate Transient RDF (rdf) -> ML Triple Index
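A rough idea of what a DITA 2 RDF transform could produce is shown below in Python, emitting N-Triples lines from a topic. It rests on several assumptions that are not in the slides: facts are stored as `<data name="..." value="..."/>` elements in the topic prolog, subjects and predicates live under the placeholder namespace `http://example.org/`, and the title maps to the Dublin Core `title` term. The project's actual transform may work quite differently.

```python
import xml.etree.ElementTree as ET

BASE = "http://example.org/"  # placeholder namespace, an assumption
DC_TITLE = "http://purl.org/dc/terms/title"  # Dublin Core title term

# Hypothetical topic; storing facts in <data> elements is an assumption.
TOPIC = """<topic id="chapter-4">
  <title>Chapter 4</title>
  <prolog>
    <data name="mentions" value="Eroni Kumana"/>
  </prolog>
  <body><p>Text of the chapter.</p></body>
</topic>"""


def dita_to_ntriples(topic_xml):
    """Turn a topic's title and <data> facts into N-Triples lines.

    Literal escaping is omitted, so values containing quotes or
    backslashes would need extra handling.
    """
    root = ET.fromstring(topic_xml)
    subject = f"<{BASE}topic/{root.get('id')}>"
    lines = []
    title = root.findtext("title")
    if title:
        lines.append(f'{subject} <{DC_TITLE}> "{title}" .')
    for data in root.iter("data"):
        name, value = data.get("name"), data.get("value")
        if name and value:
            lines.append(f'{subject} <{BASE}vocab/{name}> "{value}" .')
    return lines


for line in dita_to_ntriples(TOPIC):
    print(line)
```

Because the RDF is regenerated from the DITA source on each run, it can stay transient: the topic remains the single place where editors maintain facts.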
22
Diagram: SAN drives, RSuite Tomcat server, and MarkLogic nodes with triple indexes
- SAN Drives: 500 GB, 100 GB / disk (Non XML Disks 1 to 5)
- RSuite Tomcat Server: MySQL disk; temp directories for (1) XSLT transforms and (2) file uploads
- MarkLogic Node 1: 4 CPU, 2 cores / CPU; Index; Triples on Disk 1 - Forest 1 (600 GB); Disk 2 - Forest 2 (600 GB); Disk 3 - Backup (300 GB)
- MarkLogic Node 2: 4 CPU, 2 cores / CPU; Index; Triples on Disk 1 - Forest 3 (600 GB); Disk 2 - Forest 4 (600 GB); Disk 3 - Backup (300 GB)
- MarkLogic Node 3: 4 CPU, 2 cores / CPU; Index; Triples on Disk 1 - Forest 5 (600 GB); Disk 2 - Forest 6 (600 GB); Disk 3 - Backup (300 GB)
23
De-Silo-ize
Custom APIs are used to communicate between silos.
Silos: DAM, Web Host Provider, ISBN DB, ebook Store, Published Docs, CMS
24
Hub Spoke: No Silos
Uses standardized RDF “connectors” to communicate.
Systems: DAM, ISBN DB, Web Host Provider, ebook Store, Published Docs, CMS
25
Call To Action
- Contribute to DITA RDF Project
- Build a Knowledge Engine