Growing the Semantic Web By Charla Woodbury June 11, 2004.



INTERNET to SEMANTIC WEB
• The present internet is too large to search for specific information in its present format
• The Semantic Web holds the promise of a much richer and more easily searchable information resource
• Most current research targets small areas of Semantic Web development rather than looking at the whole process and showing its advantages
• What is needed is a working example of the Semantic Web that demonstrates the advantages and minimizes the problems, so that webpages for the Semantic Web can start to grow

High-volume Information Publishers should be the first TARGET
• The old adage says that to change a lake's water, you deal with the new water coming in rather than the water already in the lake
• By starting with high-volume information publishers, the nature of the internet lake would change very quickly

[Figure: Obituary Prototype. Labels: Newspaper Publisher; Daily News HOME PAGE; Daily News obituaries; Embedded Obituary Ontology; Obituary vocabulary; WordNet]

Once the faucet is turned on, the pool of Semantic Webpages would grow very quickly

Thesis Statement
A cost/benefit analysis will show that populating the Semantic Web by building an embedded OWL ontology, and a corresponding specialized vocabulary on top of WordNet, for EACH information publisher is practical and cost effective. An obituary prototype serves as the test case.

ADVANTAGES  Each information publisher  The ontology is only built once and used many times  The specialized vocabulary is only built once and accessed many times  The ontology and vocabulary belong to the publisher who can change them as the format and vocabulary of the obituaries they produce change (deletion discouraged)  Most of the cost would be incurred in setting up the ontology and the specialized vocabulary

ADVANTAGES  Information extraction would be done without contacting the publisher other than an agent  There would be no need to index the information once the information retrieval portion was in place  HTML information is easy to store and maintain  HTML files are much smaller than digitized microfilm presently used

METHODS
Each Newspaper
• Contact selected newspapers to produce semantic obituary webpages
• Learn how they archive the HTML version of the newspaper
• Get estimates of the cost to the newspaper to index, microfilm, and store their archives
• Ask an obituary reporter to list the specialized vocabulary, then build the vocabulary and the OWL ontology to be embedded
• Train a newspaper employee to test and edit the ontology and vocabulary
• Test the vocabulary and ontology to make sure that they are sufficiently inclusive
• Compare the time needed to set up the first newspaper with the time needed for subsequent ones

METHODS
Organizations using Obituary information
• Contact Family History businesses, Genealogical societies, and Government agencies that would use obituary information
• Find out how they get their obituary information now and how much that costs in time and money
• Measure their future interest in using agents to retrieve obituary information instead
• Discover which parts of the obituary information they consider the minimum needed for their work and which additional information would be desired and optimal
• Present the results of the obituary prototype and re-measure their future interest in using agents to retrieve obituary information

PROBLEMS  The first problem is how to entice publishers to start the process  The basic problem is a semantic one? How will regional burial practices and language differences impact the process?  But the biggest problem is how to maintain the ontology and vocabulary with the least amount of human intervention

First Problem
How to entice publishers to start the process of making semantic webpages?
• Find Grants, Research Money, and/or money from Corporate sponsorship by those companies that would profit from the information
• Petition for Government Support
    • Office of Internet Semantic Information (i.e. Library of Congress)
• Demonstrate by prototype: Obituaries
    • Process works well (Electric lights in large cities)
    • Specific information is far more easily found
    • Their information is more available
    • The maintenance process is minimal
    • The rewards are maximal
    • Everyone else is doing it

SECOND PROBLEM
The basic problem is a semantic one: how will regional burial practices and language differences impact the process?
• The basic format of the specialized vocabulary would be the same as WordNet, with rich word relationships (e.g. interred, interment, buried, burial as synonyms)
• Regional and language differences would be expressed by adding rich vocabulary as deemed necessary by the individual publisher
• Fine-tune and test the vocabulary and the ontology
• Teach the computer to speak obituary language
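
The word-relationship idea on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the presenter's design: the dictionary layout, the function names, and the concept label "burial" are assumptions made here, and the sketch does not call WordNet itself. It only shows how related word forms could map to one concept and how a publisher could add regional terms without altering the shared base vocabulary.

```python
# Hypothetical specialized vocabulary: each concept groups the related
# word forms a publisher might use. The single entry comes from the slide's
# own example; the structure itself is an assumption for illustration.
OBITUARY_VOCABULARY = {
    "burial": {"interred", "interment", "buried", "burial"},
}


def build_lookup(vocabulary):
    """Invert the concept -> words mapping into a word -> concept lookup."""
    lookup = {}
    for concept, words in vocabulary.items():
        for word in words:
            lookup[word.lower()] = concept
    return lookup


def add_regional_terms(vocabulary, concept, terms):
    """Return a new vocabulary with a publisher's regional terms added.

    The input vocabulary is copied, so the shared base stays unchanged.
    """
    extended = {name: set(words) for name, words in vocabulary.items()}
    extended.setdefault(concept, set()).update(t.lower() for t in terms)
    return extended


def concept_for(word, lookup):
    """Return the concept a word belongs to, or None if it is unknown."""
    return lookup.get(word.lower())
```

A publisher-specific copy would be produced with something like `add_regional_terms(OBITUARY_VOCABULARY, "burial", ["entombed"])`, where "entombed" is only a made-up example of a regional term.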

THIRD PROBLEM
How to simplify and automate the testing and maintenance of the ontology and vocabulary?
TESTING and SIMPLE MAINTENANCE
• Install a tool for creating and editing an OWL ontology that is as automated as possible
• Set up procedures for how often to test the ontology (e.g. new reporter, new obituary template, a set length of time)
• Write a program that tests how effective the ontology is and lists words in the obituaries that are not in the vocabulary, for review and addition to the vocabulary
• Teach the machine to add those words automatically to the vocabulary if possible
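
The vocabulary-testing program described on this slide might look roughly like the sketch below. It is an assumption-laden illustration: the tokenizer, the function names, and the use of simple word coverage as the "effectiveness" measure are choices made here, not something the presentation specifies. It handles English-style tokens only and checks the vocabulary, not the OWL ontology.

```python
import re
from collections import Counter


def tokenize(text):
    """Split text into lowercase words, keeping internal apostrophes and hyphens."""
    return re.findall(r"[a-z]+(?:['-][a-z]+)*", text.lower())


def unknown_words(obituaries, vocabulary):
    """Count words that occur in the obituaries but are not in the vocabulary.

    The result is meant for human review before anything is added.
    """
    known = {word.lower() for word in vocabulary}
    counts = Counter()
    for text in obituaries:
        counts.update(w for w in tokenize(text) if w not in known)
    return counts


def coverage(obituaries, vocabulary):
    """Fraction of word occurrences found in the vocabulary (1.0 if no words)."""
    known = {word.lower() for word in vocabulary}
    total = 0
    hits = 0
    for text in obituaries:
        for word in tokenize(text):
            total += 1
            if word in known:
                hits += 1
    return hits / total if total else 1.0
```

Calling `unknown_words(obituaries, vocabulary).most_common()` yields the review list ordered by frequency, so the most common missing words are looked at first.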

Evaluation  Cost/benefit analysis in time and money between the original process and the new Semantic Web process  Survey those testing and maintaining the Semantic Webpages about the process and the tools provided  Compare Survey given to possible information retrievers before and after demonstration of the obituary prototype

CONTRIBUTIONS  A working model of the Semantic Web  A growing pool of semantic webpages for future information extraction & retrieval  As new standards emerge, adjustments in the process could be made immediately and only once for everyone  A replacement for the cost of human indexing the information

Future Work
How will agents interpret many different obituary ontologies and vocabularies?
[Figure: several Newspaper Publishers, each with its own Embedded Obituary Ontology, Daily News obituaries, and Obituary vocabulary]

Future Work
Should there be one global obituary ontology and/or one global burial vocabulary? (All languages and burial practices)
[Figure: GLOBAL Obituary Ontology]

Future Work
Or will the agent be smart enough to traverse the associated vocabulary for the correct information?
[Figure: an AGENT and several Obituary vocabularies]

Future Work
How will the agents deliver the extracted obituary information?
Obituary Extracted Database:
Daily News || 26 Jan 2004 || Charles Lambert || b. 12 June 1911 || d. 24 Jan 2004
HTML REPORT: All Obituaries with surname LAMBERT
URLs to the actual Newspaper Obituaries:
• Charles Lambert d. 24 Jan 2004
• Richard Greaves Lambert d. 17 Oct 2003
[Figure labels: Embedded Obituary Ontology; Daily News obituaries]
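
The delivery step on this slide, from extracted records to an HTML report by surname, could be sketched as follows. Everything here is illustrative: the `ObituaryRecord` fields mirror the columns of the sample database row, but the `url` field, the function names, and the rule that the surname is the last word of the name are assumptions made for this sketch.

```python
from dataclasses import dataclass
from html import escape


@dataclass
class ObituaryRecord:
    """One extracted obituary; fields follow the slide's sample row."""
    newspaper: str
    published: str
    name: str
    born: str
    died: str
    url: str  # hypothetical link back to the newspaper's obituary page


def surname(record):
    """Assume the surname is the last word of the name (a simplification)."""
    return record.name.split()[-1].upper()


def html_report(records, wanted_surname):
    """Build a small HTML report linking to every obituary with the surname."""
    wanted = wanted_surname.upper()
    matches = [r for r in records if surname(r) == wanted]
    items = "\n".join(
        f'<li><a href="{escape(r.url, quote=True)}">{escape(r.name)}</a>'
        f" d. {escape(r.died)}</li>"
        for r in matches
    )
    return (
        f"<h1>All obituaries with surname {escape(wanted)}</h1>\n"
        f"<ul>\n{items}\n</ul>"
    )
```

Values are passed through `html.escape` so that names or URLs containing characters such as `&` or `<` do not break the generated page.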

Future Work  Will it be necessary to hire and pay obituary indexers?  Will the newspapers continue to be microfilmed or just stored in HTML? Will storage space be an issue?  Will the whole process including information retrieval be cost effective?

QUESTIONS? COMMENTS?