Presentation on theme: "Harmonizing Semantics in E-Government"— Presentation transcript:
1 Harmonizing Semantics in E-Government
Presentation to the Ontolog-Forum (http://ontolog.cim3.net)
Brand L. Niemann
U.S. Environmental Protection Agency Enterprise Architecture Team
CIO Council’s Architecture and Infrastructure Committee (AIC)
Co-Chair, Semantic Interoperability Community of Practice (SICoP)
CIO Council’s Best Practices Committee (Knowledge Management Working Group)
April 22, 2004
2 A Little History
Led a Team That Won the Special Award for Innovation with XML and VoiceXML Web Services from Mark Forman and the Quad Council at FOSE, March 2002.
Led the CIO Council XML Web Services Working Group from August 2002-October 2003:
  TopQuadrant led the Semantic Technologies for eGovernment Pilot.
  TopQuadrant helped organize the very successful Semantic Technologies for eGovernment Conference at the White House Conference Center, September 8, 2003.
  The TopQuadrant Pilot and the CIO Council’s Knowledge Management Working Group (Best Practices Committee) Helped Start the new Semantic Interoperability Community of Practice (SICoP).
The XML Web Services Working Group Pilots Supported the Development of the:
  Federal Enterprise Architecture’s (FEA) Data and Information Reference Model and Its Data Management Strategy; and the
  Government Enterprise Architecture Framework (GEAF) of the CIO Council’s Architecture and Infrastructure Committee (AIC) Governance Subcommittee.
3 Organizational Relationships
[Organization chart. Labels as extracted: Industry Advisory Council (IAC); U.S. CIO Council; OMB - FEAPMO; Enterprise Architecture Special Interest Group; Architecture & Infrastructure Committee; IT Workforce; Connections; Best Practices Committee; WGs and CoPs; Subcommittees: Governance, Components, Emerging Technologies; Semantic Interoperability Community of Practice; Chief Architects Forum]
4 Some Upcoming Events
Collaboration Expedition Workshop #31, April 28th, National Science Foundation, Ballston, Virginia:
  Joint Workshop with SICoP on Multiple Taxonomies:
  See
Collaboration Expedition Workshop #32, May 11th, National Science Foundation, Ballston, Virginia:
  Workshop on Emerging Technology Innovations in Software Component Development, Reuse, and Management – Applications to Government Enterprise Architecture (e.g. the new Chief Architects Forum CoP):
SICoP Monthly Meeting #2, May 19th, MITRE, McLean, Virginia:
  Progress Reports on White Paper Modules (3), Collaboration Tools, Discussion of Common Upper Ontologies, etc.
  See and
Fourth Quarterly Emerging Technology Components Conference, June 3rd, MITRE, McLean, Virginia:
  Populating the Service Grid with Service Components:
  See
5 An Upcoming Event
Joint Workshop with SICoP on Multiple Taxonomies, April 28th:
Welcome:
  Organizer: Michel Biezunski, Coolheads Consulting
The Semantic Web-What Is This Really About?
  Renee Lewis, Pensare Group
Increased Knowledge Sharing and Mission Success: Implementing Taxonomies for NASA:
  Jayne Dutra, Jet Propulsion Laboratory
Master and Relational Taxonomies:
  Kevin Hannon, Independent Consultant
Clustering of Search Results With and Without Taxonomies:
  Raul Valdez-Perez, Vivisimo, Inc.
6 An Upcoming Event
Joint Workshop with SICoP on Multiple Taxonomies, April 28th (continued):
Semantics, Ontologies, and the Semantic Web:
  Leo Obrst, The MITRE Corporation
How to Create Many Taxonomies That Integrate Into a Single Enterprise-Wide Taxonomy:
  Denise Bedford, The World Bank
Ontology Overview:
  Adam Pease, Independent Consultant
Issues in Negotiating Multiple Semantic Models:
  LeeEllen Friedland, The MITRE Corporation
Accessibility, Usability, and Preservation of Government Information:
  Eliot Christian, USGS and Chair, Categorization of Government Information Working Group of the Interagency Committee on Government Information
Open Dialogue:
  Steven Newcomb, Coolheads Consulting
7 A Past Event
SICoP Monthly Meeting #1, April 14th, Army CIO’s Office, Crystal City, Virginia:
Part 1 Community Business:
  Old Business:
    Minutes and Charter
  Emerging Products:
    White Paper On Implementing the Semantic Web:
      Module 1: Harnessing the Power of Information Semantics (Jie-Hong Morrison, State Department)
      Module 2: Exploring the Business Value of Semantic Interoperability (Irene Polikoff, TopQuadrant)
      Module 3: Roadmap for Operationalizing the Semantic Web (Michael Daconta, Smart Data Associates) (Slides 13-14)
    Army Knowledge Management Conference, August 31-September 2nd, Semantic Web Track (need speakers).
Posted at Past Meetings and Presentations, April 14th
8 A Past Event
SICoP Monthly Meeting #1, April 14th, Army CIO’s Office, Crystal City, Virginia (continued):
Part 2 Building Shared Understanding:
  Ontologies for Semantically Interoperable Systems, Leo Obrst, The MITRE Corporation (deferred to the next meeting) (Slides 9-12)
  A Data and Information Reference Model (DRM) Registry and Repository Pilot, Brand Niemann, US EPA (deferred to the next meeting)
  Common Upper Ontology for Cross-Domain Semantic Interoperability, Jim Schoening, The U.S. Army Communications Electronics Command
Part 3 Launching/Building the Supported Community of Practice:
  Proposed CoP Development Process, Rick Morris, US Army CIO Office
  Facilitated Discussion, Rick Morris and Brand Niemann, Co-Chairs
9 Application Data Tightness of Coupling & Semantic Explicitness
[Figure: application and data technologies arranged by Looseness of Coupling, from Local (Implicit, TIGHT) to Far (Explicit, Loose), against Semantics Explicitness. Annotations: Performance = k / Integration_Flexibility; From Synchronous Interaction to Asynchronous Communication. Labels as extracted: Modal Policies; Internet; Semantic Mappings; Semantic Brokers; OWL-S; Agent Programming; RDF/S, OWL; Peer-to-peer; Web Services: UDDI, WSDL; Web Services: SOAP; XML, XML Schema; Data; Applets; Community; Application; N-Tier Architecture; EAI; Workflow; Ontologies; Same Intranet; Conceptual Models; Middleware Web; Enterprise; Data Marts; Same Wide Area Network; Client-Server; Data Warehouses; Same Local Area Network; Federated DBs; Distributed Systems OOP; Systems of Systems; Same DBMS; Same OS; Same Address Space; Same CPU; Linking; Same Programming Language; Same Process Space; Compiling; 1 System: Small Set of Developers]
10 Dimensions of Interoperability & Integration
[Figure: Interoperability Scale from 0% to 100%, showing 6 Levels of Interoperability and 3 Kinds of Integration, with the annotation "Our interest lies here". Labels as extracted: Community; Enterprise; System; Semantic; Application; Component; Syntactic; Structural; Object; Data]
11 Ontology Spectrum: One View
[Figure: spectrum running from weak semantics to strong semantics, with three bands: Syntactic Interoperability, Structural Interoperability, Semantic Interoperability. Labels as extracted, from strong to weak: Modal Logic; First Order Logic; Logical Theory (Is Disjoint Subclass of with transitivity property); Description Logic; DAML+OIL, OWL; UML; Conceptual Model (Is Subclass of); RDF/S; XTM; Extended ER; Thesaurus (Has Narrower Meaning Than); ER; DB Schemas, XML Schema; Taxonomy (Is Sub-Classification of); Relational Model, XML]
12 Ontology Spectrum: One View
[Figure: the same spectrum, annotated with problem scope and expressivity. Annotations as extracted: Problem: Very General, Semantic Expressivity: Very High; Problem: General, Semantic Expressivity: Medium; Semantic Expressivity: High; Problem: Local, Semantic Expressivity: Low]
13 The Smart Data Enterprise Figure 2. Developer's Perspective on Data: To the application developer, the data evolution timeline is viewed through the correlation of programming paradigms with the relation of data and code. From: Designing the Smart-Data Enterprise, Get prepared for the 10 ways that semantic computing will impact enterprise IT, by Michael C. Daconta, Posted November 28, 2003, Enterprise Architect Magazine.
14 The Smart Data Enterprise Figure 3. The Smart Data Continuum: Data has progressed through four stages of increasing intelligence. (Reprinted with permission from The Semantic Web: A Guide to the Future of XML, Web Services, and Knowledge Management [John Wiley & Sons, 2003].) From: Designing the Smart-Data Enterprise, Get prepared for the 10 ways that semantic computing will impact enterprise IT, by Michael C. Daconta, Posted November 28, 2003, Enterprise Architect Magazine.
15 Abstract
The history and broader context of this work.
  See Section 1.
The eGov Act of 2002 has two sections (207 & 212) which require more structure and interoperability for government data and information, and work has begun in several committees and communities of practice.
  See Section 2 (just a few highlights).
The new Semantic Web standards and technologies provide a way to accomplish the purposes of the eGov Act of 2002 and the FEA Data and Information Reference Model Data Management Strategy.
  See Section 3 (will skip over for this group).
The work on repurposing the Statistical Abstract of the United States, 2003, into a DRM Registry and Repository illustrates how a number of objectives can be accomplished at the same time, including the highest priority of the CIO Council’s Architecture and Infrastructure Committee, namely “intergovernmental exchange of data and information”.
  See Section 4 (just a few highlights).
The additional pilots underway are outlined.
  See Section 5.
16 Overview
1. Introduction (slides 17-19).
2. eGovernment Drivers: The eGov Act of 2002 and the FEA Data and Information Reference Model (DRM) (slides 20-32).
3. Semantic Technologies for eGovernment (slides 33-49).
4. Repurposing the Statistical Abstract of the United States, 2003, Into a DRM Registry and Repository (slides 50-72).
5. Additional Pilots (slides 73-74).
17 1. Introduction
Repurposing of large documents with mixed content (text, tables, graphics, etc.) into XML content collections began with “The Statistical Abstract of the United States” (1999 Edition) as part of the FedStats.Net project to build a distributed network of statistical data and information using new XML standards and technology.
The Statistical Abstract of the United States was considered to be one of the best examples of "manual aggregation of government information" (from some 200 programs across about 70 agencies) that would benefit from a distributed XML-based content network that would leave the content in the hands of its originators and create a more "living document".
This work was recognized by OMB Associate Director for Information Technology and E-Government, Mark Forman, and the Quad Council with a Special Award for Innovation in the 2002 CIO Showcase of Excellence for the use of XML in a distributed content network (renamed FedGov) and use of VoiceXML in providing universal access to emergency response information.
18 1. Introduction
More recently, the eGov Act of 2002's provisions for an Intergovernmental Committee on Government Information (ICGI) and Data Integration Pilots, the Federal Enterprise Architecture's Data and Information Reference Model (DRM) and its Data Management Strategy, and the focus in the CIO Council's Architecture and Infrastructure Committee on Intergovernmental Data Exchange have all been tied together in a new pilot that simultaneously accomplishes multiple objectives (see next slide).
This “Smart Data Enterprise” approach came from the “Semantic Technologies for eGov Conference”, September 8, 2003, at the White House Conference Center (in which the EPA CIO and her staff participated), and continues in the new CIO Council’s Semantic Interoperability (Web Services) Community of Practice (SICoP) (see subsequent slides).
19 1. Introduction
(1) Repurpose government data and information into structured documents using new XML-based standards and technologies that facilitate reuse and exchange.
(2) Repurpose the data and information so that it can be readily decomposed into XML fragments (for text and tables) and RDF metadata (for graphics) that can be stored and referenced in a database and can in turn be repurposed into new documents that provide additional user-defined views of the data and information.
(3) Organize and categorize the repurposed data and information using taxonomies and even ontologies in semantic registries and repositories.
(4) Use "XML data islands", RDF, and OWL to add metadata, interoperability, and semantic meaning to data and information to be reused and exchanged.
(5) Standardize the data element and XML tag names in a DRM registry and repository.
(6) Share these results with others who are working on Semantic Web and Technology Applications for eGovernment.
20 2. eGovernment Drivers
The eGov Act of 2002:
SEC. 207. ACCESSIBILITY, USABILITY, AND PRESERVATION OF GOVERNMENT INFORMATION.
(a) PURPOSE.—The purpose of this section is to improve the methods by which Government information, including information on the Internet, is organized, preserved, and made accessible to the public.
(b) DEFINITIONS.—In this section, the term—
  (1) “Committee” means the Interagency Committee on Government Information established under subsection (c); and
  (2) “directory” means a taxonomy of subjects linked to websites that—
    (A) organizes Government information on the Internet according to subject matter; and
    (B) may be created with the participation of human editors.
21 2. eGovernment Drivers
The eGov Act of 2002 (continued):
SEC. 207. ACCESSIBILITY, USABILITY, AND PRESERVATION OF GOVERNMENT INFORMATION.
(d) CATEGORIZING OF INFORMATION.—
  (1) COMMITTEE FUNCTIONS.—Not later than 2 years after the date of enactment of this Act, the Committee shall submit recommendations to the Director on—
    (A) the adoption of standards, which are open to the maximum extent feasible, to enable the organization and categorization of Government information—
      (i) in a way that is searchable electronically, including by searchable identifiers; and
      (ii) in ways that are interoperable across agencies;
    (B) the definition of categories of Government information which should be classified under the standards; and
    (C) determining priorities and developing schedules for the initial implementation of the standards by agencies.
Note: Received the 2002 CIO Council Showcase of Excellence Special Innovation Award for XML Web Services (VoiceXML and the FedGov Content Network) in March 2002.
22 2. eGovernment Drivers
The eGov Act of 2002 (continued):
SEC. 212. INTEGRATED REPORTING STUDY AND PILOT PROJECTS.
(a) PURPOSES.—The purposes of this section are to—
  (1) enhance the interoperability of Federal information systems;
  (2) assist the public, including the regulated community, in electronically submitting information to agencies under Federal requirements, by reducing the burden of duplicate collection and ensuring the accuracy of submitted information; and
  (3) enable any person to integrate and obtain similar information held by 1 or more agencies under 1 or more Federal requirements without violating the privacy rights of an individual.
23 2. eGovernment Drivers
The eGov Act of 2002 (continued):
SEC. 212. INTEGRATED REPORTING STUDY AND PILOT PROJECTS.
(d) PILOT PROJECTS TO ENCOURAGE INTEGRATED COLLECTION AND MANAGEMENT OF DATA AND INTEROPERABILITY OF FEDERAL INFORMATION SYSTEMS.—
  (1) IN GENERAL.—In order to provide input to the study under subsection (c), the Director shall designate, in consultation with agencies, a series of no more than 5 pilot projects that integrate data elements. The Director shall consult with agencies, the regulated community, public interest organizations, and the public on the implementation of the pilot projects.
  (2) GOALS OF PILOT PROJECTS.—
    (A) IN GENERAL.—Each goal described under subparagraph (B) shall be addressed by at least 1 pilot project each.
    (B) GOALS.—The goals under this paragraph are to—
      (i) reduce information collection burdens by eliminating duplicative data elements within 2 or more reporting requirements;
      (ii) create interoperability between or among public databases managed by 2 or more agencies using technologies and techniques that facilitate public access; and
      (iii) develop, or enable the development of, software to reduce errors in electronically submitted information.
24 2. eGovernment Drivers
The Federal Enterprise Architecture (FEA) Data and Information Reference Model (DRM):
Volume 1 – Bob Haycock, OMB Chief Architect, will soon release with guidance to the agencies.
  The E-Government Act 2002, Section 207, Interagency Committee on Government Information, will use top two layers of the DRM structure for categorization of government information (see next slide).
  The E-Government Act 2002, Section 212, calls for a series of no more than 5 pilot projects that integrate data elements to encourage integrated collection and management of data and interoperability of Federal Information systems.
Data Management Strategy – In process and draft to be released soon.
  Have several critiques of the ISO to improve the DRM Model including the suggested use of the Meta Object Facility (MOF) from the Object Management Group (OMG) by MetaMatrix (see slide 16).
Volumes 2-4 – To Be Released by Fall 2004 (see slides 17-19).
  DRM business context, DRM information exchange, and DRM data elements.
25 The Current DRM Model
A model for discovery of information:
  Context and classification.
  To determine available packages and elements.
A model for exchange of information:
  Information packages, built from common data elements.
  Sharing mechanism.
A model for representation of information:
  Data elements defined in standard way.
[Diagram labels: BUSINESS CONTEXT (Subject Area, Super Type); BUSINESS DATA FLOW (Information Exchange Package); DATA ELEMENT (Data Object, Data Property, Data Representation; ISO 11179)]
26 Expanding the DRM Model
MetaMatrix vision:
  Generic classification to tag metadata with context:
    vs. 2-level context.
  Packages built from complex datatypes and deployable for exchange or data access:
    vs. exchange-only packaging of ISO data elements.
  Formal datatype model:
    vs. more conceptual ISO model.
  Formal reference information to add semantic value to data definitions:
    vs. nothing.
[Diagram: MetaMatrix Model compared with DRM Model. Labels as extracted: CLASSIFICATION; BUSINESS CONTEXT; Context; Subject Area; Category; Super Type; PACKAGE; BUSINESS DATA FLOW; Virtual Database; Exchange Package; Info Exch Package; INSTANCE; Virtual; Transform; Physical; TYPE; DATA ELEMENT; Schema/Association; Complex Datatype; Data Object; ISO 11179; Abstract Datatype; Data Property; Simple Datatype; Data Representation; REFERENCE; Glossary; Thesaurus; Bibliography]
27 2. eGovernment Drivers
VOLUME II: BUSINESS CONTEXT
Data Governance: Governance Structure, Policy & Procedures
  Purpose: Define policy & procedures for use of Information Categories in OMB 300 reporting and government information indexing.
Data Architecture: Information Categories, Data Groups, and Security Profile
  Purpose: Catalogue and Index Government Information consistent with the E-Gov Act.
Data Sharing Architecture: Information Categories, Data Groups, and Exchange Security Requirements
  Purpose: Identify and define federated data classifications to discover commonalities and opportunities for re-use.
28 2. eGovernment Drivers
VOLUME III: INFORMATION FLOW
Data Governance: Governance Structure, Policy & Procedures
  Purpose: Define and enforce policy that governs the use and protection of information packages available in the DRM registry.
Data Architecture: Information Exchange Packages, Security Profile
  Purpose: Define data groups (tables, records, messages, text) and attributes that reflect business process needs common to a Community of Practice.
Data Sharing Architecture: Information Maps, Exchange Security Requirements
  Purpose: Define data transformation patterns and key attributes that determine data exchange processing requirements.
29 2. eGovernment Drivers
VOLUME IV: DATA ELEMENT DESCRIPTION
Data Governance: Governance Structure, Policy & Procedures
  Purpose: Define and enforce requirements for Data Standardization.
Data Architecture: Data Element Descriptions, Security Profile
  Purpose: Define and maintain data structures that reflect business data entity attributes and relationships.
Data Sharing Architecture: Object Descriptions, XML Schemas, Exchange Security Requirements
  Purpose: Define and maintain metadata required to provide or support a specific service pattern.
30 2. eGovernment Drivers
The FEA DRM Data Management Strategy, Business Driver 4: Resolve Data Semantics Issues That Impede Community of Practice Work, Brand Niemann and Ken Gill:
  Introduction to Data Semantics.
  Domain Data Harmonization Strategy.
  Data Harmonization Guiding Principles (10).
  Global Justice Information Sharing Initiative (Global) Example.
  Increased Collaboration by Means of and with “Smart Data” (Daconta’s Declaration of Data Independence).
  Recommendations.
Note: See for details.
31 2. eGovernment Drivers
The FEA DRM has been and currently is the object of a series of pilot projects and collaborative work within the Communities of Practice:
Open GIS Consortium (OGC):
  Information Communities and Semantics WG (ICS WG):
Sustainable Intergovernmental Network Exchange (Global-Justice, Environmental Information-EPA, and Health IT Sharing (Health)) (SINE):
  Collaborative Work Environment:
Intelligence Community Metadata Working Group (IC MWG):
CIO Council’s (Best Practices Committee) Knowledge Management Working Group (KM.GOV):
Semantic (Web Services) Interoperability Community of Practice (SICoP):
  See and
32 2. eGovernment Drivers
The FEA DRM has been and currently is the object of a series of pilot projects and collaborative work within the Communities of Practice (continued):
E-Gov SmartServices:
  To join the group send an to with empty Subject and Body. You will then receive an with a web link where you can select the subscription option.
Open International Forum on Business Ontology:
  ONTOLOG - collaborative work environment:
  (April 22nd presentation)
Semantic Technologies for E-Government, September 8, 2003, White House Conference Center:
  2nd Semantic Technologies for E-Government, September 8, 2004 (tentative).
University of Maryland MINDLab (Professor Jim Hendler) and TopQuadrant (Ralph Hodgson):
  and
  TopMIND Tutorials with Government Data Examples, March 22-25, 2004:
33 3. Semantic Technologies for eGovernment
Web-Enabled Government 2004 Conference and Exhibition, Session 2-4, February 4th, 2004
Understanding Semantic Web Technology by Professor Jim Hendler and Brand Niemann:
(1) Tree of Knowledge Technologies and The Semantic Technology “Layer Cake”
(2) Where We Are
(3) Emerging Vendors Landscape: Semantic Integration
(4) Semantic Technologies and Web Services
(5) The First Site on the Semantic Web
(6) Taxonomy
(7) Topic Maps
(8) RDF and Ontology Components
(9) RDF Syntax and Validator
(10) OWL Syntax and Functionality
(11) Some Educational Resources
Note: Based on TopMIND Tutorials, November 3-4, and December 3-4, 2003
34 3. Semantic Technologies for eGovernment
Jim Hendler is a Professor at the University of Maryland and the Director of Semantic Web and Agent Technology at the Maryland Information and Network Dynamics Laboratory. He holds joint appointments in the Department of Computer Science, the Institute for Advanced Computer Studies and the Institute for Systems Research, and he is also an affiliate of the Electrical Engineering Department. He has authored close to 150 technical papers in the areas of artificial intelligence, robotics, agent-based computing and high performance processing.
Hendler was the recipient of a 1995 Fulbright Foundation Fellowship, is a member of the US Air Force Science Advisory Board, and is a Fellow of the American Association for Artificial Intelligence. As Chief Scientist and Program Manager at DARPA for the DAML program, he has been one of the major drivers in the creation of the Semantic Web, and continues to be a prominent player in the W3C’s Semantic Web Activity.
35 (1) Tree of Knowledge Technologies
[Figure labels: Content Management Languages; Semantic Technology Languages; Process Knowledge Languages; AI Knowledge Representation; Software Modeling Languages]
36 (1) The Semantic Technology “Layer Cake” Source: Dieter Fensel
37 (2) Where We Are
[Figure annotation: We Are Here]
Source: Tim Berners-Lee, “Standards, Semantics and Survival”,
38 (3) Emerging Vendors Landscape: Semantic Integration
[Figure: vendors positioned by Expressivity and Semantic Power (XML, RDFS, OWL) against Enterprise Support. Legend, Current Support / Primary Strength: S = Structured information; U = Unstructured information; S&U = Supports both. Vendors as extracted: Ontoprise; Unicorn; Network Inference; enLeague; Ontology Works; Miosoft; Celcorp; Modulant; Contivo; SchemaLogic; IGS; MetaMatrix; Vitria. Enterprise Support labels as extracted: Data and Schema; Run-time; Integration and Management; Validation; Engine; Orchestration]
Source: Irene Polikoff, TopQuadrant, Positioning Semantic Technologies: The Emerging Vendor Landscape, September 8, 2003.
39 (4) Semantic Technologies and Web Services
[Figure: Static Resources (WWW, Semantic Web) and Dynamic Resources (Web Services, Semantic Web Services), arranged from Interoperable Syntax to Interoperable Semantics. Additional labels: Semantic Web Services; Enterprise Ontology and Web Services Registry]
Source: Derived in part from two separate presentations at the Web Services One Conference 2002 by Dieter Fensel and Dragan Sretenovic.
40 (5) The First Site on the Semantic Web PhotoStuff: Image Annotation Tool with OWL
41 (6) Taxonomy
Goals for enterprise taxonomies
Regardless of end goals, look to a future where taxonomies interoperate (domains connect)
Expect new stakeholders to take an interest…
… but have their own viewpoints
Technology Recommendation: RDF(S)
From Tim Berners-Lee, ISWC 2003
42 (6) What is a Taxonomy?
A taxonomy is a model of knowledge organized as a hierarchical arrangement (tree structure) of concepts: parent nodes denote more general ideas than their children.
animal
  horse
    [A] mare, stallion OR [B] dales pony, arabian horse
  sheep
    [A] ewe, ram OR [B] swaledale, cheviot
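The tree structure described on this slide can be sketched in a few lines of Python. This is only an illustration, not something from the presentation: the `PARENT` mapping and the helper names `ancestors` and `is_more_general` are assumptions made for this sketch, and it uses just one of the slide's two alternative sub-groupings (mare/stallion and ewe/ram).

```python
# Hypothetical sketch: a taxonomy stored as a child -> parent mapping.
# The concepts come from the slide's example; the helper names are illustrative.
PARENT = {
    "horse": "animal",
    "sheep": "animal",
    "mare": "horse",
    "stallion": "horse",
    "ewe": "sheep",
    "ram": "sheep",
}


def ancestors(concept):
    """Return the chain of increasingly general concepts above `concept`."""
    chain = []
    while concept in PARENT:
        concept = PARENT[concept]
        chain.append(concept)
    return chain


def is_more_general(general, specific):
    """True if `general` is an ancestor of `specific` in the tree."""
    return general in ancestors(specific)


if __name__ == "__main__":
    print(ancestors("mare"))
    print(is_more_general("animal", "ewe"))
```

Storing only the parent link keeps the "parent is more general than child" rule explicit: generality is simply reachability by walking upward.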
43 (6) Types of Taxonomy
A taxonomy can be:
  A classification hierarchy, e.g.:
    Natural Taxonomy: Unique Beginner (plant) -> Life-Form (bush) -> Generic (rose) -> Specific (hybrid tea) -> Varietal (Peace)
  A part hierarchy (Meronomy)
  A category hierarchy
Taxonomies can intersect – intersection means there are different relationships at work:
[Diagram: two intersecting taxonomies. Labels as extracted: building; cinema; office-block; synagogue; mosque; pub; church; shrine; holy place]
Reference: D.A. Cruse, “Lexical Semantics”, Cambridge University Press, 1986
44 (6) topSAIL/tdf™ – Taxonomy Development Framework: A five-step method for taxonomy development
1. Focus
  What is the taxonomy for?
  What business challenges will it overcome?
  What results will it achieve?
  How to measure stakeholder benefit?
2. Analysis
  What is the context for the taxonomy?
  What are the types & sources of knowledge?
  How does knowledge map to processes?
3. Design
  What types of taxonomy concepts are needed?
  What to do first?
  What system capabilities are needed?
  What will be the impact?
  Is the taxonomy design correct, complete and consistent?
4. Construct
  Have we enough content mapped?
  How to connect taxonomies to content?
  How to integrate with IT systems?
5. Deploy
  How do we ensure there will be feedback for assessment?
  Have we accomplished set objectives?
  What should be done next?
45 (7) Topic Maps
The TAO of Topic Maps: Topic, Association, Occurrence
Topic:
  The entry in a topic map that refers to a subject in the real world.
  Topic Maps make a Plato-distinction between Things in the Real World (Subjects) and Things in the Topic Map world (Topics).
Association:
  Linkages between Topics.
  Tosca was written by Puccini.
Occurrence:
  Topics “occur” in resources.
  Resources are indicated by, e.g., URLs.
  Types of Occurrence: mention, illustration, article, etc.
Note: See
Also see for merging of topic maps.
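As a rough illustration of the three Topic Map building blocks, the Python sketch below models them as plain data classes. It is not the XTM data model: the class names, fields, `describe` helper, and the example.org URL are all assumptions made for this sketch; only the Tosca/Puccini association comes from the slide.

```python
from dataclasses import dataclass, field


@dataclass
class Topic:
    """Stands in the topic map for a subject in the real world."""
    name: str
    # Occurrences as (occurrence type, resource URL) pairs.
    occurrences: list = field(default_factory=list)


@dataclass
class Association:
    """A typed link between two topics."""
    kind: str
    source: Topic
    target: Topic


def describe(assoc):
    """Render an association as a readable statement."""
    return f"{assoc.source.name} {assoc.kind} {assoc.target.name}"


tosca = Topic("Tosca")
puccini = Topic("Puccini")
# Hypothetical resource URL, used only to show an occurrence of type "article".
tosca.occurrences.append(("article", "http://www.example.org/tosca.html"))
written_by = Association("was written by", tosca, puccini)

if __name__ == "__main__":
    print(describe(written_by))
```

Keeping occurrences on the topic and associations as separate objects mirrors the slide's distinction: occurrences point out of the map to resources, while associations stay inside the map.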
46 (8) RDF and Ontology Components
[Figure: RDF (Resource Description Framework) triple components, Subject, Predicate, and Object, illustrated with the sentence “The company sells batteries.” Legend: URI; Literal; Property or Association. Key ontology components as extracted: Image; Person (birthdate: date, Gender: char); Leader; Organization; Resource; Literal; relations depiction, knows, published, works for, is-A, leads]
Source: The Semantic Web: A Guide to the Future of XML, Web Services, and Knowledge Management, Wiley Technology Publishing, June 2003.
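The subject, predicate, object structure on this slide can be sketched as a small in-memory list of triples. This is a plain-Python illustration rather than an RDF library: the `Triple` type, the `EX` namespace, the URIs built from it, and the `objects_of` helper are assumptions for this sketch. The first triple encodes the slide's sentence; the second is an illustrative use of the "works for" relation from the figure labels.

```python
from collections import namedtuple

Triple = namedtuple("Triple", ["subject", "predicate", "object"])

EX = "http://www.example.org#"  # hypothetical namespace for this sketch

triples = [
    # "The company sells batteries." (the object here is a literal)
    Triple(EX + "TheCompany", EX + "sells", "batteries"),
    # Illustrative triple whose object is another resource (a URI).
    Triple(EX + "Person", EX + "worksFor", EX + "Organization"),
]


def objects_of(subject, predicate):
    """Return every object stated for the given subject and predicate."""
    return [
        t.object
        for t in triples
        if t.subject == subject and t.predicate == predicate
    ]


if __name__ == "__main__":
    print(objects_of(EX + "TheCompany", EX + "sells"))
```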
47 (9) RDF Syntax and Validator
Graph of the Data Model
<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns="http://www.example.org#">
  <rdf:Description rdf:ID="Jen">
    <rdf:type><rdf:Description rdf:about="Person"/></rdf:type>
    <hasName>Jen Golbeck</hasName>
    <hasJob>
      <rdf:Description rdf:about="Job1">
        <employer>George Washington University</employer>
        <position>Adjunct Professor</position>
        <hired>July 2001</hired>
        <salary>$1</salary>
        <hoursPerWeek>15</hoursPerWeek>
      </rdf:Description>
    </hasJob>
    <!-- ... -->
  </rdf:Description>
</rdf:RDF>
48 (10) OWL Syntax and Functionality
<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
         xmlns:owl="http://www.w3.org/2002/07/owl#"
         xmlns="http://www.example.org#"
         xml:base="http://www.example.org/">
  <owl:Class rdf:ID="Person"/>
  <owl:Class rdf:ID="Employee">
    <rdfs:subClassOf rdf:resource="#Person"/>
    <rdfs:label>Our Cool Employee Class</rdfs:label>
  </owl:Class>
  <owl:Class rdf:ID="Civil_Servant">
    <rdfs:subClassOf rdf:resource="#Employee"/>
    <rdfs:label>Our Cool Civil Servant Class</rdfs:label>
  </owl:Class>
  <owl:Class rdf:ID="Woman">
    <rdfs:label>Our Cool Woman Class</rdfs:label>
  </owl:Class>
  <owl:Class rdf:ID="Man">
    <rdfs:label>Our Cool Man Class</rdfs:label>
  </owl:Class>
  <!-- ... -->
</rdf:RDF>
Applications for OWL:
  Markup for web pages and other web-based media.
  Raw Data Sharing.
  Web Services.
Media Markup:
  Google and other keyword searches are excellent because they can work with text.
    Not likely to be much improved by semantic web.
  Image searches are much worse than text searches.
    No way to know what is happening in an image, what is in it, in what context it was taken, or who is doing what.
  MP3 searches.
    I want that song that was in the Mitsubishi commercial…
  Video search.
Challenges:
  Trust & Provenance.
  Visualization.
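To show what the `rdfs:subClassOf` declarations in the OWL example buy you, the following Python sketch reads a trimmed copy of that class hierarchy with the standard library's `xml.etree.ElementTree` and walks the subclass links transitively. It is a minimal sketch, not an OWL reasoner: the function names are assumptions, the embedded document keeps only three of the slide's classes, and it only handles classes declared with `rdf:ID` and `rdf:resource="#..."` as on the slide.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the slide's class hierarchy (labels omitted).
OWL_DOC = """<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
         xmlns:owl="http://www.w3.org/2002/07/owl#">
  <owl:Class rdf:ID="Person"/>
  <owl:Class rdf:ID="Employee">
    <rdfs:subClassOf rdf:resource="#Person"/>
  </owl:Class>
  <owl:Class rdf:ID="Civil_Servant">
    <rdfs:subClassOf rdf:resource="#Employee"/>
  </owl:Class>
</rdf:RDF>"""

# ElementTree spells qualified names as "{namespace}local".
RDF = "{http://www.w3.org/1999/02/22-rdf-syntax-ns#}"
RDFS = "{http://www.w3.org/2000/01/rdf-schema#}"
OWL = "{http://www.w3.org/2002/07/owl#}"


def direct_superclasses(xml_text):
    """Map each declared class ID to the classes it directly subclasses."""
    root = ET.fromstring(xml_text)
    parents = {}
    for cls in root.findall(OWL + "Class"):
        name = cls.get(RDF + "ID")
        parents[name] = [
            sub.get(RDF + "resource").lstrip("#")
            for sub in cls.findall(RDFS + "subClassOf")
        ]
    return parents


def all_superclasses(name, parents):
    """Follow subClassOf links transitively, nearest superclass first."""
    seen = []
    stack = list(parents.get(name, []))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.append(current)
            stack.extend(parents.get(current, []))
    return seen


if __name__ == "__main__":
    hierarchy = direct_superclasses(OWL_DOC)
    print(all_superclasses("Civil_Servant", hierarchy))
```

The transitive walk is the point: nothing in the document says a Civil_Servant is a Person, yet that follows from the two declared subclass links.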
49 (11) Some Educational Resources
Johan Hjelm, “Creating the Semantic Web with RDF”, John Wiley, 2001
Dieter Fensel, “Ontologies: A Silver Bullet for Knowledge Management and Electronic Commerce”, Springer Verlag, 2001
John Davies, Dieter Fensel & Frank van Harmelen, “Towards the Semantic WEB – Ontology Driven Knowledge Management”, John Wiley, 2002
Dieter Fensel, Wolfgang Wahlster, Henry Lieberman, James Hendler (Eds.), “Spinning the Semantic Web: Bringing the World Wide Web to Its Full Potential”, MIT Press, 2002
Michael C. Daconta, Leo J. Obrst, Kevin T. Smith, “The Semantic Web: A Guide to the Future of XML, Web Services, and Knowledge Management”, John Wiley, 2003
Vladimir Geroimenko (Editor), Chaomei Chen (Editor), “Visualizing the Semantic Web”, Springer-Verlag, 2003
M. Klein and B. Omelayenko (eds.), “Knowledge Transformation for the Semantic Web”, Vol. 95, Frontiers in Artificial Intelligence and Applications, IOS Press, 2003
Shelley Powers, “Practical RDF”, O’Reilly, 2003
50 4. Repurposing the Statistical Abstract of the United States, 2003, Into a DRM Registry and Repository
Overview
Steps in Repurposing the Data Tables:
  (1) Table in Adobe Reader 6.0.
  (2) Define Basic XML Tags in XMLSPY 2004.
  (3) Define XML Tags for Data Element Names in XMLSPY 2004.
  (4) Markup the Table in XMLSPY 2004.
  (5) Grid View in XMLSPY 2004.
  (6) XML Table Database in Excel 2002.
  (7) Create the HTML Interface.
  (8) HTML Interface in Browser.
  (9) XML Table Database in Browser.
Some Features of the DRM Registry and Repository:
  Note that it is embedded in the document itself, not separate!
51 4. Repurposing the Statistical Abstract of the United States, 2003, Into a DRM Registry and Repository
Overview:
The methodology for repurposing the Statistical Abstract, 2003, documents (45 PDF files/14.2 MB) into a structured XML content collection was presented previously:
  See “Past Meetings and Presentations” at November 18-19, 2003, Website Content Management for Government Conference, Invited Presentation on November 19th on "Repurposing Documents Into Semantic Web Services and Networks" (EPA Enterprise Integration Portal/Data Exchange Network Pilot), Doubletree Hotel, Arlington, VA. Also see Folio-to-XML Conversion and Webinar.
Current plans call for the completion of the repurposing of this document and continued work on state of the environment and national and community indicator reports.
52 Step 1. Table in Adobe Reader 6.0 Text Select Tool & Highlight Table, Edit & Copy, & Edit & Paste to XML SPY 2004
53 Step 2. Define Basic XML Tags in XMLSPY 2004 <TableTitle><TableHeadNote><TableBody><TableFootnote><TableSource>
54 Step 3. Define XML Tags for Data Element Names in XMLSPY 2004
Data Element Names:
Census Date (Year, Month & Day)
Resident Population (Number)
Resident Population (Number Per Square Mile of Land Area)
Resident Population Increase Over Preceding Census (Number)
Resident Population Increase Over Preceding Census (Percent)
Area (Square Miles) Total
Area (Square Miles) Land
Area (Square Miles) Water
XML Tag Names:
CensusDateYearMonthDay
ResidentPopulationNumber
ResidentPopulationPerSquareMileofLandArea
ResidentPopulationIncreaseOverPrecedingCensusNumber
ResidentPopulationIncreaseOverPrecedingCensusPercent
AreaSquareMilesTotal
AreaSquareMilesLand
AreaSquareMilesWater
The “heart” of the DRM Registry and Repository for reuse!
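The pairing of data element names and XML tag names in Step 3 suggests a simple naming rule: drop the spaces and punctuation from the data element name. The sketch below illustrates that rule in Python. The rule is inferred from the slide, not stated by it, and the helper name `to_tag_name` is hypothetical. The slide's tag for population density (ResidentPopulationPerSquareMileofLandArea) omits the word "Number", so at least that tag appears to have been shortened by hand.

```python
import re


def to_tag_name(data_element_name: str) -> str:
    """Hypothetical helper: keep only letters and digits of a data element name."""
    return re.sub(r"[^A-Za-z0-9]", "", data_element_name)


# Data element names copied from the Step 3 slide.
names = [
    "Census Date (Year, Month & Day)",
    "Resident Population (Number)",
    "Area (Square Miles) Total",
]

for name in names:
    print(f"{name} -> {to_tag_name(name)}")
```

Applying one rule to every table is one way to keep tag names consistent across a large collection, which is what makes them reusable.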
55 Step 4. Markup the Table in XMLSPY 2004 Text View in XMLSPY 2004
56 Step 4. Markup the Table in XMLSPY 2004 (continued) Text View in XMLSPY 2004
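The Text View screenshots are not reproduced in this transcript. As a rough illustration of what the marked-up table could look like, the Python sketch below combines the basic tags from Step 2 with some of the data element tags from Step 3. The `<Table>` root, the `<Row>` wrapper, and all cell values are assumptions made for illustration; the actual file layout and data are not shown in the slides.

```python
import xml.etree.ElementTree as ET

# Assumed layout: a hypothetical <Table> root holding the five basic tags
# from Step 2, with one hypothetical <Row> per census inside <TableBody>.
table = ET.Element("Table")
ET.SubElement(table, "TableTitle").text = "PLACEHOLDER title"
ET.SubElement(table, "TableHeadNote").text = "PLACEHOLDER head note"
body = ET.SubElement(table, "TableBody")

# One row using three of the tag names from Step 3 (placeholder values only).
row = ET.SubElement(body, "Row")
for tag in ("CensusDateYearMonthDay", "ResidentPopulationNumber", "AreaSquareMilesTotal"):
    ET.SubElement(row, tag).text = "PLACEHOLDER"

ET.SubElement(table, "TableFootnote").text = "PLACEHOLDER footnote"
ET.SubElement(table, "TableSource").text = "PLACEHOLDER source"

# Serialize to a string, as it might appear in a text view.
xml_text = ET.tostring(table, encoding="unicode")
print(xml_text)
```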
57 Step 5. Grid View in XMLSPY 2004 (like a spreadsheet!) Highlight Grid Table, Edit & Copy as Structured Text, and Paste to Excel.
58 Step 6. XML Table Database in Excel 2002 Highlight Table, Format & Column & AutoFit Selection. Also, spreadsheet-like data tables can be pasted into XMLSPY 2004.
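Steps 5 and 6 move the table between XML and a spreadsheet by copy and paste. The same round trip can be scripted. The Python sketch below flattens rows of a marked-up table into CSV text that Excel can open. It is not the method used in the slides, and the `<Table>`/`<TableBody>`/`<Row>` layout, the function name `rows_to_csv`, and the sample values are assumptions.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical sample; the cell values are placeholders, not real statistics.
SAMPLE = """<Table><TableBody>
<Row><CensusDateYearMonthDay>DATE-1</CensusDateYearMonthDay><ResidentPopulationNumber>POP-1</ResidentPopulationNumber></Row>
<Row><CensusDateYearMonthDay>DATE-2</CensusDateYearMonthDay><ResidentPopulationNumber>POP-2</ResidentPopulationNumber></Row>
</TableBody></Table>"""


def rows_to_csv(xml_text: str) -> str:
    """Flatten each <Row> into one CSV line; tag names become the header."""
    root = ET.fromstring(xml_text)
    rows = root.findall("./TableBody/Row")

    # Column order follows the first appearance of each tag name.
    columns = []
    for row in rows:
        for cell in row:
            if cell.tag not in columns:
                columns.append(cell.tag)

    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=columns, lineterminator="\n")
    writer.writeheader()
    for row in rows:
        writer.writerow({cell.tag: (cell.text or "") for cell in row})
    return out.getvalue()


print(rows_to_csv(SAMPLE))
```

Because the header row is taken from the XML tag names, the spreadsheet columns stay tied to the harmonized names from Step 3.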
59 Step 7. Create the HTML Interface: Navigation Functionality (non-XML). Note two references to statabs2003no1.xml.
60 Step 7. Create the HTML Interface (continued): Data Element Names, XML Tag Names. Note this makes the XML table database independent of the HTML presentation.
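The point of Step 7 is that the display labels (data element names) live in the HTML layer while the data stays in the XML file under its tag names. The slides do not show the HTML source, so the Python sketch below does not reproduce the author's interface; it only illustrates the same separation. The `LABELS` mapping, the function name `render_html`, the row layout, and the sample values are all assumptions.

```python
import html
import xml.etree.ElementTree as ET

# Presentation layer: XML tag name -> human-readable data element name.
LABELS = {
    "CensusDateYearMonthDay": "Census Date (Year, Month & Day)",
    "ResidentPopulationNumber": "Resident Population (Number)",
}

# Data layer: hypothetical sample with placeholder values.
SAMPLE = """<Table><TableBody>
<Row><CensusDateYearMonthDay>DATE-1</CensusDateYearMonthDay><ResidentPopulationNumber>POP-1</ResidentPopulationNumber></Row>
<Row><CensusDateYearMonthDay>DATE-2</CensusDateYearMonthDay><ResidentPopulationNumber>POP-2</ResidentPopulationNumber></Row>
</TableBody></Table>"""


def render_html(xml_text: str, labels: dict) -> str:
    """Render the XML rows as an HTML table, using labels only for display."""
    root = ET.fromstring(xml_text)
    header = "".join(f"<th>{html.escape(label)}</th>" for label in labels.values())
    lines = ["<table>", f"<tr>{header}</tr>"]
    for row in root.findall("./TableBody/Row"):
        cells = "".join(
            f"<td>{html.escape(row.findtext(tag, default=''))}</td>" for tag in labels
        )
        lines.append(f"<tr>{cells}</tr>")
    lines.append("</table>")
    return "\n".join(lines)


print(render_html(SAMPLE, LABELS))
```

Changing a label in `LABELS` changes only the page, and editing the XML changes only the data, which is the independence the slide notes.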
61 Step 8. HTML Interface in Browser: Link to XML File, Navigation Buttons. Can easily browse through long tables.
62 Step 9. XML Table Database in Browser: Can expand and collapse using + and -. The “heart” of the DRM Registry and Repository for interoperable exchange.
63 Some Features of the DRM Registry and Repository Taxonomy of Federal Statistical Data and Information!
64 Some Features of the DRM Registry and Repository Detailed Table of Contents for Entire Document.
65 Some Features of the DRM Registry and Repository Detailed Table of Contents for Each Section.
66 Some Features of the DRM Registry and Repository Graphics can have RDF metadata.
67 Some Features of the DRM Registry and Repository Tables are structured data (copy to Excel) and available in XML
68 Some Features of the DRM Registry and Repository Table copied to Excel from Browser
69 Some Features of the DRM Registry and Repository Search within just one chapter of the entire document.
70 Some Features of the DRM Registry and Repository Better search than from conventional Internet search engines.
71 Some Features of the DRM Registry and Repository Appendix III on Limitations of the Data (Data Quality) for Major Databases!
72 Some Features of the DRM Registry and Repository Harmonization/Standardization of Data Element and XML Tag Names
73 5. Additional Pilots
Where does the FEA go next?, Bob Haycock, Chief Architect, OMB, at the Chief Architects Forum, April 5, 2004:
Complete the DRM.
Conduct DRM Community of Practice Pilots.
Continue to develop and implement further DRM volumes and FEA Data Management Strategy.
Etc.
74 5. Additional Pilots
Census Bureau/FedStats (Statistical Abstract of the US): Lead original Line of Business (Data and Statistics) which was exempted so it became a logical selection for a “best practice” pilot!
National Indicator System and the Community Statistical System: GAO, CEQ, Community Indicator Consortium, etc.
Sustainable Intergovernmental Network Exchange (SINE): Global Justice, EPA, Health, etc.
Intelligence Community Metadata Working Group (IC MWG): XML Enablement Strategy and Tool Evaluation.
Componenttechnology.Org: Proposals from participants in this Community of Practice to “Populate the Service Grid with Services Components”.
Categorization of Government Information Working Group of the Interagency Committee on Government Information: GSA Office of Intergovernmental Solutions (Susan Turnbull) Outreach to Involve State and Local Governments.
University of Maryland MINDLab (Professor Jim Hendler) and TopQuadrant (Ralph Hodgson): Semantic Markup and Tools for Government Content (getting content ready for them!).