1 Developing an Ontolog Ontology Denise A. D. Bedford April 13, 2006.


1 Developing an Ontolog Ontology
Denise A. D. Bedford
April 13, 2006

2 Presentation Goals
Primary purpose of presentation today is to:
- Establish a framework for developing an ontology that will focus on the current and future content of the Ontolog community, and support a range of uses of the Ontolog and Ontolog-referenced content, by Ontolog members and non-members
- Provide a sustainable foundation for future variations in content, use and users, which is extensible without radical re-engineering going forward
- Provide a framework against which a basic set of functional architecture requirements can be defined (June discussion)
- Provide a framework against which various semantic technologies might be positioned to support Ontolog (April and June discussions)

3 Presentation Goals
Secondary purpose of presentation today is to:
- Provide a basis for a case study in collaborative practice domain ontology development and management
- Provide a comparison, along the way, of the various ontology reference models
- If the group wishes, along the way, provide the community with guidance in positioning semantic solutions vis-à-vis semantic problems

4 Goal is not to…
- Advocate one particular semantic approach over others, because they all serve different purposes
- Provide a survey of or evaluate the individual technologies on the market today
- Suggest that any one person has a solution that works for everyone
Rather, to discuss a strategy or approach for addressing the problem

5 Some Basic Questions
- How can we anchor our ontology? I.e., where do we start?
- How do we know if we need one ontology or many?
- How do we know if we need to create one or if we can borrow/adapt one from someone else?
- Let's take as a starting point a framework with three essential components that need to be addressed by any ontology we define:
  - Content
  - Users
  - Use/processes
- These basic reference points should give us sufficient scenarios to understand the basic functional requirements our ontology will have to satisfy

6 The Context for an Ontolog Ontology
[Diagram labels: Users; Use or Function; Information (Document); Context]

7 Users
- May seem like the easiest dimension to address, but we need to make sure we have the same goals for the Ontolog ontology
- Do we assume that only active Ontolog members will be served by the ontology?
- Or do we support all members and the general public who might be interested in joining the community, or who might find the wiki content a valuable resource for learning?
- Are we assuming only Ontolog sophisticates, or do we include general managers, novices and general public interest?

8 User Community
Who | Domain Knowledge | Roles
Ontolog Member | Wiki | Wiki Manager
Ontolog Member/Non-Member | Ontology research & development | Researchers, discussants, presenters, novices
Ontolog Member/Non-Member | Computational linguistics | Researchers, discussants, presenters, novices
Ontolog Member/Non-Member | Standards development work | Participants, vendors, observers, implementors
Ontolog Member/Non-Member | Metadata | Creators, users, semantics developers, computational linguists
Ontolog Member/Non-Member | Taxonomies | Creators, designers, users, semantics developers, computational linguists
Ontolog Member/Non-Member | Information Architecture | Engineers, information scientists
Ontolog Member/Non-Member | Semantic Technologies | Developers, users, implementors, linguists, novices

9 Use and Context
- It is challenging for people who are so familiar with ontology development and semantic technologies to step back and think about how an ontology would actually support our use of the Ontolog content
- But this is a critical first step: without understanding the use and context, we cannot establish a baseline ontology
- Without understanding use and context we will forever argue about which model works best, which tools work best and who should do what. In fact, there is room for variation and negotiation here
- The following tables are the result of some brainstorming and observations from the Ontolog community itself

10 Possible Uses of Ontolog Content
Doing | What
Find | Person who knows something about an issue
Browse | Issues that Ontolog has discussed
Find | All people who participated in a discussion
Learn about | Reference models discussed by Ontolog
Get list of | Problems Ontolog identified that need attention
Browse | Collections by topic
Search | Future conference call topics

11 Possible Uses of Ontolog Content
Doing | What
Search | Next scheduled call
Search | Specific email message
Find | List of all members of Ontolog
Find | Specific Ontolog member
Find | Reference to ontology standards
Find | Book references
Find | Organizations working in this area

12 Possible Uses of Ontolog Content
Doing | What
Find | Upcoming conferences & participants
Generate | Knowledge map of who knows what in ontologies
Generate | Map of the social networking in Ontolog
Publish | Review of a new book
Start | Discussion of a new topic
Annotate/summarize | Discussion thread
Others?? | Others??

13 Content
- Let's do a simple exercise of defining the kinds of content that the ontology has to cover. This may seem like the easiest component to define, although a lot of the content is not as obvious as we might think
- When we began our work with semantic technologies at the World Bank five years ago we started from a content model perspective: all of our content types have data models (see next slide), which distinguish between 'concepts' and 'instances'
- The content models, combined with the nature of their use and the type of user, helped us to identify the kinds of semantic problems we would encounter
- You can then evaluate semantic technologies vis-à-vis your semantic problems. Without this analysis, you may end up creating a situation you cannot manage or sustain
- Let's extract the set of content objects from the previous tables, then let's see what else people expect from the Ontolog community wiki

14 Content Data Model Example: Event, Communique

15 First Cut at Ontolog Content
- Ontolog People profiles/pages
- Ontolog presentations
- Ontolog discussion threads
- Ontolog concepts
- Ontolog Activity Calendar
- Ontolog Conference call notes
- Ontolog Conference call agendas
- Ontolog Conference call minutes
- Ontolog Conference call transcripts
- Email messages
- Discussion threads/forums
- Wiki search logs
- Professional Conference schedules & announcements
- Professional Conference representation
- Books on ontology topics
- Published articles on ontology topics
- Reviews of books on ontologies
- Ontology standards
- Professional organizations
- Research institutions

16 Ontology Architecture Begins to Emerge
[Diagram. Node labels: Content Entity Definition; Content Elements; Content Metadata Profile; Ontolog Topic Class Scheme; Authority Control - Member Names; Thesaurus of Ontolog Concepts; Areas of Expertise; Authority Control - Organizations; User; Profile; Meaning in Use; Contextual Matrix & Sensing; Business Rule; Content Model; Aggregation Levels. Edge labels: Has; Has values; Uses; Contains; Has relationship to; Understood in.]

17 Functional Requirements Begin to Emerge
We begin to see how all of the components of the semantic architecture fit together:
- Metadata schema
- Different kinds of taxonomies (controlled lists, rings, hierarchies, concept networks)
- Semantic analysis tools to support metadata capture
- Metadata encoding options (XML, RDF, etc.)
- Metadata storage options (e.g. embedded in document, distinct database, etc.)
- Search system which supports attribute searching and which leverages reference sources
- Browse structure
- Reporting
- Data mining and clustering
- Other more sophisticated inference and reasoning options (shall we try to discover or test some standard axioms for ontologies?)
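To make the "metadata encoding options" item concrete, here is a minimal sketch of one metadata record encoded as XML with Python's standard library. The attribute names (`content_type`, `title`, `creator`, `date`, `topic`) and the topic values are assumptions chosen for illustration; the slides do not define an Ontolog schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical metadata record for one Ontolog content object.
# Attribute names and topic values are illustrative, not an agreed schema.
record = {
    "content_type": "Ontolog presentation",
    "title": "Developing an Ontolog Ontology",
    "creator": "Denise A. D. Bedford",
    "date": "2006-04-13",
    "topic": ["Ontology development", "Metadata"],
}


def to_xml(rec: dict) -> str:
    """Encode a metadata record as a small XML document."""
    root = ET.Element("record")
    for name, value in rec.items():
        values = value if isinstance(value, list) else [value]
        for v in values:
            # Repeatable attributes become repeated elements.
            ET.SubElement(root, name).text = v
    return ET.tostring(root, encoding="unicode")


xml_text = to_xml(record)
print(xml_text)
```

The same record could equally be stored in a separate database or embedded in the document itself; the encoding choice is independent of the storage choice listed above.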

18 Metadata Schema and Taxonomies
- Schema needs to cover all kinds of content we've identified
- We need to identify at least the basic attributes of the content: keep it simple, purposeful, and targeted to use and users
- Discuss which attributes need to be managed and which do not
- Keep the horse in front of the cart: what needs to be managed should be analyzed in terms of its data structures, syntax and semantics before we can specify the type of ontology that is needed

19 Faceted taxonomy at center; other types as controlling sources; distinct ontologies
[Diagram labels: Concept networks; Ontolog Topics; Names; Any one value might have many synonyms (ring)]
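The note that any one value might have many synonyms (a ring) can be sketched as a small lookup structure. This is a minimal sketch, assuming a ring is simply a set of equivalent terms with no preferred form; the terms themselves are invented for illustration.

```python
# A synonym ring: a set of equivalent terms with no preferred form.
# The terms below are illustrative, not an actual Ontolog vocabulary.
rings = [
    {"thesaurus", "thesauri"},
    {"taxonomy", "taxonomies", "classification scheme"},
]

# Index every term (lowercased) to the ring that contains it.
ring_index = {term.lower(): ring for ring in rings for term in ring}


def expand(term: str) -> set[str]:
    """Return all terms equivalent to `term`, or just the term itself."""
    return set(ring_index.get(term.lower(), {term}))


print(expand("Taxonomies"))
```

A search system could call `expand` on each query term so that a search on one variant also retrieves content tagged with the others.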

20 Ontolog Metadata Coverage & Strategies
Attribute | Semantic Challenge | Solution
People names, institution names, organization names | Variations | Harmonization through concept extraction
Ontolog Topics | Distill the topics of interest, maintain | Automated categorization
Concepts | Breadth of coverage, variations | Concept extraction, harmonization through clustering
People skills & competencies | Distill a list, maintain | Concept extraction, harmonization through categorization
Domain knowledge | Distill the list of domains, map to topics | Categorization, harmonization

21 Semantic Technologies
- The most primitive level of semantic discovery and harmonization is the human brain and language
- But a human approach to metadata, based on our experience, is neither scalable nor practical: it can help you to discover what your reference sources are, but it cannot be sustained for tagging
- Cleanup, disconnects, and the amount of technical resources needed to compensate for unmanaged 'human semantics' can be costly and resource-intensive to support
- Rather, leverage the human semantics to inform the semantics, not the other way around
- The question then is how to leverage the semantic tools to support the ontology
- Where do the tools fit? What functions do they support?
- What resources are needed to sustain them?

22 Categorizing Content: Real World Example
- World Bank adopted an automated solution for 'tagging' content, all kinds of content, which is now operational in systems
- Let's take as examples selected attributes and illustrate how we're categorizing our content to this structure automatically
- Topic classification, geographical region assignment, keywording examples
- This approach can be applied to any kind of content, as long as you have some electronic content to work with (electronic information about or from people can be used to generate people profiles)
- Enables us to build a robust metadata repository model, with strong metadata quality, to move towards semantic interoperability (SI) at the functional level
- Also note that we can do this across many languages

23 Sidebar: What is Teragram?
- Semantic analysis tools which support concept extraction, categorization, summarization and pattern matching rules engines
- Teragram works in 23 languages
- Use categorization to capture Topics, Business Activities, Regions, Sectors, Themes, etc.
- Use Concept Extraction to capture keywords
- Use Rules Engine to capture Loan #, Credit #, Project ID, Trust Fund #, etc.
- Use Summarization to generate a 'gist' of the content

24 Use of Semantic Technologies: Example
- Sample structure: Topics Classification Scheme (hierarchical taxonomy)
- Oracle data classes used to represent the Topic Classification scheme
  - hierarchical taxonomy as reference source for the attribute Topic
  - used for Browse, Search, Content Syndication, Personalization
- 1st challenge is to architect the hierarchy correctly
  - 3 distinct data classes, not a tree structure with inheritance
- Allows you to use the three data classes for distinct functions across systems but still enforce relationships across the classes
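The design point about three distinct data classes with enforced relationships, rather than one inheritance tree, can be sketched as follows. This is only an illustration under stated assumptions: the class names `Topic`, `Subtopic` and `Concept`, the `TopicScheme` container and the sample labels are hypothetical, and plain Python stands in for the Oracle data classes mentioned on the slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Topic:
    topic_id: str
    label: str


@dataclass(frozen=True)
class Subtopic:
    subtopic_id: str
    label: str
    topic_id: str  # explicit reference to a Topic, not inheritance


@dataclass(frozen=True)
class Concept:
    concept_id: str
    label: str
    subtopic_id: str  # explicit reference to a Subtopic


class TopicScheme:
    """Holds the three classes separately and enforces the links between them."""

    def __init__(self) -> None:
        self.topics: dict[str, Topic] = {}
        self.subtopics: dict[str, Subtopic] = {}
        self.concepts: dict[str, Concept] = {}

    def add_topic(self, topic: Topic) -> None:
        self.topics[topic.topic_id] = topic

    def add_subtopic(self, subtopic: Subtopic) -> None:
        if subtopic.topic_id not in self.topics:
            raise ValueError(f"unknown topic: {subtopic.topic_id}")
        self.subtopics[subtopic.subtopic_id] = subtopic

    def add_concept(self, concept: Concept) -> None:
        if concept.subtopic_id not in self.subtopics:
            raise ValueError(f"unknown subtopic: {concept.subtopic_id}")
        self.concepts[concept.concept_id] = concept

    def subtopics_of(self, topic_id: str) -> list[str]:
        return [s.label for s in self.subtopics.values() if s.topic_id == topic_id]


# Hypothetical sample values, for illustration only.
scheme = TopicScheme()
scheme.add_topic(Topic("T1", "Education"))
scheme.add_subtopic(Subtopic("S1", "Primary Education", "T1"))
scheme.add_concept(Concept("C1", "school enrollment", "S1"))
print(scheme.subtopics_of("T1"))
```

Because each class lives in its own collection, one system can use only the topic list for browsing while another uses the concept list for keywording, and the reference checks still keep the levels consistent.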

25 Three Oracle data classes

26 Relationships across data classes

27 Subtopics Domain concepts or controlled vocabulary

28 Extensive operators allow us to write grammatical rules to manage typical semantic problems

29 Concept-based rules engine allows us to define patterns to capture other kinds of data

30 Example of use of Authority Control to capture country names but extract 'authorized' version of country name
Example of use of a gazetteer + concept extraction + rules engine to support semantic interoperability
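The idea of capturing any variant of a country name while recording only the authorized form can be sketched with a small lookup table. This is not how Teragram implements it; the authority entries below are illustrative assumptions, and simple regular-expression matching stands in for the gazetteer and concept extraction.

```python
import re

# Hypothetical authority file: authorized country name -> known variants.
AUTHORITY = {
    "Russian Federation": ["Russia"],
    "Lao People's Democratic Republic": ["Laos", "Lao PDR"],
}

# Map every variant and authorized form, lowercased, to the authorized form.
LOOKUP = {}
for authorized, variants in AUTHORITY.items():
    for name in [authorized, *variants]:
        LOOKUP[name.lower()] = authorized

# Try longer names first so a full name wins over a shorter overlapping one.
PATTERN = re.compile(
    r"\b("
    + "|".join(re.escape(n) for n in sorted(LOOKUP, key=len, reverse=True))
    + r")\b",
    re.IGNORECASE,
)


def extract_countries(text: str) -> list[str]:
    """Return authorized country names in order of first mention, deduplicated."""
    found = []
    for match in PATTERN.finditer(text):
        authorized = LOOKUP[match.group(1).lower()]
        if authorized not in found:
            found.append(authorized)
    return found


print(extract_countries("Projects in Russia, Laos and the Russian Federation"))
```

Storing only the authorized form in the metadata is what lets two systems that tag the same document independently agree on the value.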

31 Use of concept extraction + rules engine to capture Loan #, Credit #, Project ID#
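Rule-based capture of fixed-form identifiers can be approximated with pattern rules. The sketch below uses Python regular expressions rather than the Teragram rules engine, and the identifier formats (a "P" followed by six digits for a Project ID, four or five digits after "Loan No." or "Credit #") are assumptions made for illustration, not documented World Bank formats.

```python
import re

# Hypothetical pattern rules; the identifier formats are assumptions.
RULES = {
    "project_id": re.compile(r"\b(P\d{6})\b"),
    "loan_number": re.compile(r"\bLoan\s+(?:No\.?|#)\s*(\d{4,5})\b", re.IGNORECASE),
    "credit_number": re.compile(r"\bCredit\s+(?:No\.?|#)\s*(\d{4,5})\b", re.IGNORECASE),
}


def capture(text: str) -> dict[str, list[str]]:
    """Apply every rule to the text and collect the captured identifiers."""
    return {
        name: [m.group(1) for m in pattern.finditer(text)]
        for name, pattern in RULES.items()
    }


sample = "Project P012345 is financed under Loan No. 4567 and Credit #3210."
print(capture(sample))
```

Each rule yields a separate metadata attribute, so the captured values can be stored and searched as structured fields rather than as free text.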

32 Caution Regarding Tools
- Not all tools will do what we are describing here
- You need to have an underlying semantic engine which can perform semantic analysis; Bayesian/statistical data mining approaches will not work in this way
- You need to have a semantic engine in multiple languages, because semantics vary by language
- You need to have access to the programs through a user-friendly interface so you can adapt them to your environment without having to have programming knowledge
- You need to have several different kinds of technologies to do what I'm describing here
- Not all the tools on the market today support this work

33 How does semantic analysis work?

34 Semantic Analysis Basics
- Once you have made some sense of the sentence, reconstruct entities for information extraction (compose)
- Identify names and other fixed form expressions: people, organizations, conferences
- Identify basic noun groups, verb groups, presentations, other grammatical elements
- Use exposed grammars to construct rules for targeted entity extraction: noun groups and verb groups
- Identify event structures
- Identify common elements and associate

35 Enterprise Profile Development & Maintenance
[Diagram: Enterprise Profile Creation and Maintenance. Labels are listed in the order extracted.]
Enterprise Metadata Profile:
- Concept Extraction Technology: Country, Organization Name, People Name, Series Name/Collection Title, Author/Creator, Title, Publisher, Standard Statistical Variable, Version/Edition
- Categorization Technology: Topic Categorization, Business Function Categorization, Region Categorization, Sector Categorization, Theme Categorization
- Rule-Based Capture: Project ID, Trust Fund #, Loan #, Credit #, Series #, Publication Date, Language
- Summarization
Other labels:
- e-CDS Reference Sources for Country, Region, Topics, Business Function, Keywords, Project ID, People, Organization
- Data Governance Process for Topics, Business Function, Country, Region, Keywords, People, Organizations, Project ID
- Teragram Team
- TK240 Client
- ISP, IRIS, ImageBank, Factiva, JOLIS, E-Journals
- UCM Service Requests
- Update & Change Requests

36 Next Steps: Discussion
- Purpose of this presentation is to try to frame the discussion for the Ontolog community going forward
- Next week we will have a panel of speakers who will talk about aspects of the challenge of developing and applying an ontology for the Ontolog content
- In the time that we have remaining today, might we discuss what other issues need to be added to the framework?

37 Thank You!

