Classification of Business Documents DITA BusDocs Subcommittee Meeting 21 January 2008 Presentation with Notes from the Focus Group Meeting of 14 Jan 2008.



Meeting Summary
– Classification focus group members include Howard Schwartz, Eric Severson, Amber Swope, and Michael Boses. Howard was not able to attend the meeting due to travel.
– Michael presented the enclosed PowerPoint as a starting point for the discussion.
– Discussion was captured and incorporated into the PowerPoint under the heading “Notes”.
– Next steps:
  – Eric will work on a preliminary mapping of a limited number of document types that illustrate the mapping.
  – The focus group will present a summary of what we have discussed to the full subcommittee during the January 21 meeting.

Introduction - 1
The need for a classification system for business documents arises from:
– The desire to identify the specific document set that is being addressed by the subcommittee, as well as the rationale behind that selection
– The ability to further analyze the document set using a refinement of the same characteristics used to classify them

Introduction - 2
What types of characteristics are important?
– Documents can be classified in many ways. The most common is a semantic classification based upon the textual content of the document.
– The subcommittee approach is different: we want to classify documents based upon their structural characteristics, since it is the structure of business documents that will need to be harmonized with DITA.

Potential Structural Characteristics to Consider when Classifying
– Is it a narrative?
– Narrative complexity
– Document length
– Tree depth
– Tree balance
– Table frequency
– Table complexity
– Graphic frequency
– XML vocabularies
– Transclusions
Notes:
– Eric feels that repetitive structures will be an important characteristic.
– Amber suggests that whether a document references external system data might be important as well.
– Howard: Understanding the business purpose might be important as a characteristic.
– Eric: It could be interesting, but maybe not the driver.
– We will capture the information as part of the analysis.
– Ann: It’s possibly a different level of classification.
– Josef: Translation should not in itself change the structure, but perhaps what we want to look at is documents with variants in them.
– Howard: Business documents will have different challenges than technical publications.

Higher level model
– Structures that are not linked to semantics that can then be correlated to documents for different usage
– The end-game is to say where does DITA fit in?
– “Semantic neutral” way of classifying
– Apply the general to specific usages later
– Eric: Concept, task, and reference were specializations to begin with. Are they even meaningful for business documents?
– Howard: Informational vs. persuasive? Intent or purpose: does it correlate to structure, does it dictate structure, does it matter for reuse?

First-level Classification
Diagram labels: Business Documents; Tabular Reports; Forms; Narrative Documents
Notes:
– While the concept is good, none of us is happy with the terminology. In particular, we need to come up with an alternative for Forms.
– The purpose of this slide is to say that there are business documents that are out-of-scope.
– This is our first level?

Form-Narrative Scale
Diagram labels: Forms; Narrative Documents; Subject Document
Metric:
– Ratio of total elements to total words
Notes:
– Eric: What is a form? How do we keep from excluding documents with structures that we need to address because we called them a “form”? We need something to describe “form” that isn’t based upon its implementation. “XML blurs the distinction between documents and data.”
– A: Elements are “structural” in nature. We need to define what type of elements we will use to arrive at the ratio.
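The ratio of total elements to total words can be sketched in a few lines of Python. This is only an illustration under stated assumptions: every XML element is counted (the notes leave open which element types should count), words are whitespace-separated tokens, and the function name and sample documents are hypothetical.

```python
import xml.etree.ElementTree as ET


def form_narrative_ratio(xml_text: str) -> float:
    """Ratio of total elements to total words (assumption: all elements count)."""
    root = ET.fromstring(xml_text)
    # iter() walks the root and every descendant element.
    element_count = sum(1 for _ in root.iter())
    # itertext() yields all text content in document order.
    word_count = sum(len(chunk.split()) for chunk in root.itertext())
    if word_count == 0:
        # A document with markup but no words is treated as maximally form-like.
        return float("inf")
    return element_count / word_count


# Hypothetical samples: many short fields versus one long paragraph.
FORM_LIKE = "<form><name>Ann</name><dept>Sales</dept><id>42</id></form>"
NARRATIVE = (
    "<doc><p>The committee met on Monday to review "
    "the draft proposal in detail.</p></doc>"
)
```

On these two samples the form-like document scores higher than the narrative one, which is the direction the scale is meant to capture.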

Most Significant Characteristic?
Diagram labels: Narrative Document; Narrative Density?; Tree Depth?; Document Length?
– Once we have established that it is a narrative document, what is the next most significant characteristic to examine?
Notes:
– General agreement with the presentation that it would be the tree depth of the document.
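Tree depth is straightforward to measure once a document is available as XML. The sketch below is one possible reading, assuming depth means the number of element levels on the longest root-to-leaf path, with the root counted as level 1. The function name is hypothetical.

```python
import xml.etree.ElementTree as ET


def tree_depth(element: ET.Element) -> int:
    """Number of element levels on the longest path from this element down."""
    # A leaf has no children, so max() falls back to 0 and the depth is 1.
    return 1 + max((tree_depth(child) for child in element), default=0)


# Hypothetical sample: one section containing a paragraph, plus a loose paragraph.
sample = ET.fromstring("<doc><sec><p>a</p></sec><p>b</p></doc>")
depth = tree_depth(sample)
```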

– Eric: DITA is trying to apply best practices to writing. Is this a fundamental thing about writing, or is it just tech pubs? Should there be a more generic task that could be specialized into a tech pubs task?
– Ann: What we have now is a specialization for tech docs, and so it fits. It is possible to start at a higher, more generalized level.
– Interesting that paragraphs have a “topic sentence.” The topic sentence may be an important bridge that allows us to introduce the concept of topic-based authoring to the business community.
– Business documents are maturing. Are tech docs more mature?
– Tech docs are most often not read for pleasure and are “random access” information.
– Writing for reuse has a significant impact on how content is written. Does it invalidate some of our common business document structures?

Types of reuse:
– The ability to flow one person’s content into another person’s content and have it hold up contextually
– The ability to have content presented as a result of a query or aggregation and have it hold its integrity as a single unit of information
– Will the message change depending upon how someone arrived at it, either in the original context or by itself?
All this ties back to the maturity model that will help organizations move to a “best practice” approach to authoring. This will give us something valuable for business and acceptable to the DITA community. Now our classification can also correlate to this issue.

The Need to Quantify Hierarchy
Diagram labels: Narrative Document; Flat Document; Highly Nested Document
– The author of the highly nested document is using structure to communicate semantics.
Hierarchical Scale:
– Ratio of total transitions in hierarchy to total elements
Notes:
– General agreement. No specific comments.
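The slide does not define what counts as a “transition in hierarchy”, so the following Python sketch rests on an assumption: elements are visited in document order, and a transition is counted whenever two consecutive elements sit at different depths. The function names and sample documents are hypothetical.

```python
import xml.etree.ElementTree as ET


def depths_in_document_order(element, depth=1):
    """Yield the depth of each element in document order (root is depth 1)."""
    yield depth
    for child in element:
        yield from depths_in_document_order(child, depth + 1)


def hierarchical_scale(xml_text: str) -> float:
    """Ratio of depth transitions to total elements (assumed definition)."""
    root = ET.fromstring(xml_text)
    depths = list(depths_in_document_order(root))
    # Assumption: a transition is any change in depth between neighbours.
    transitions = sum(1 for a, b in zip(depths, depths[1:]) if a != b)
    return transitions / len(depths)


# Hypothetical samples: a flat run of paragraphs versus nested sections.
FLAT = "<doc><p/><p/><p/><p/></doc>"
NESTED = (
    "<doc><sec><title/><sub><p/></sub></sec>"
    "<sec><title/><p/></sec></doc>"
)
```

Under this reading the flat sample has one transition across five elements, while the nested sample has five across eight, so the nested document scores higher.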

Qualifying Narrative Density
Diagram labels: Flat Document (Light Narrative, Dense Narrative); Highly Nested Document (Light Narrative, Dense Narrative)
Narrative Density Scale:
– Average paragraph length for paragraphs > 100 characters
Notes:
– No specific comments.
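A minimal Python sketch of the narrative density scale follows. It assumes paragraph length is measured in characters (the slide only states the 100-character threshold, not the unit of the average) and that paragraphs arrive as plain strings. The function name is hypothetical.

```python
def narrative_density(paragraphs, min_chars=100):
    """Average length, in characters, of paragraphs longer than min_chars."""
    # Short paragraphs (headings, labels, list items) are excluded.
    long_paragraphs = [p for p in paragraphs if len(p) > min_chars]
    if not long_paragraphs:
        # No qualifying paragraphs: report zero rather than divide by zero.
        return 0.0
    return sum(len(p) for p in long_paragraphs) / len(long_paragraphs)


# Hypothetical sample: one short paragraph and two that pass the threshold.
sample_paragraphs = ["x" * 50, "y" * 120, "z" * 200]
density = narrative_density(sample_paragraphs)
```

Excluding short paragraphs keeps headings and one-line labels from dragging the average down.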

Recap of Characteristic Importance
– Is it a narrative?
– Narrative complexity
– Document length
– Tree depth
– Tree balance
– Table frequency
– Table complexity
– Graphic frequency
– XML vocabularies
– Transclusions
Notes:
– Eric: We need to address repetitive structures (i.e., topics) and constrained structures. What do repetitive structures and constrained structures mean to DITA?
– Michael: The number of paragraphs per section seems important, but what is a section?

Notes: Additional Discussion
Discussion of an SOP as it relates to repeating structures:
– One approach to an SOP is for it to be very verbose, with only 4-5 “structures”
– Another approach is for it to be very terse, with 20 structures that add semantics to the content
– The goal of XML in general, when applied to narrative documents, is to imply more and more of the semantics through the document structure.
– “Document linearity with repeating structures” as a structural characteristic provides “random access” to the information in the document.
– Repetitive structures appear to be as important a characteristic as the tree depth, if not more.
– Repetitive structures to a degree indicate whether the document is a reference or something intended to be read end-to-end.
– Repetitive structures cause a document to actually be a collection of mini-documents, each of which could stand alone.
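The discussion does not propose a measure for repetitive structures. One possible sketch, offered purely as an assumption, is the share of child elements whose tag name repeats among their siblings. The function name, the definition, and the sample documents are all hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def repetition_ratio(xml_text: str) -> float:
    """Share of child elements whose tag repeats among siblings (assumed measure)."""
    root = ET.fromstring(xml_text)
    repeated = 0
    total = 0
    for parent in root.iter():
        # Count how often each tag occurs directly under this parent.
        counts = Counter(child.tag for child in parent)
        for occurrences in counts.values():
            total += occurrences
            if occurrences > 1:
                repeated += occurrences
    # A document with only a root element has no children to compare.
    return repeated / total if total else 0.0


# Hypothetical samples: an SOP with repeated steps versus a one-off structure.
SOP_LIKE = "<sop><step/><step/><step/><note/></sop>"
ONE_OFF = "<doc><title/><body><p/></body></doc>"
```

In the SOP-like sample three of the four child elements are repeated steps, while the second sample has no repeated siblings at all.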