Managing Semi-Structured Data. Is the web a database?

Rules: What Rules?
Easy to create web information
Cannot all be stored in relational databases
Cannot be queried in traditional ways
“The web changed the digital information rules.”

Semi-structured Data
Fully structured data
  – Databases
  – Hidden web
Fully unstructured data: ordinary text
Semi-structured data: the grey area in between
  – No “good solutions;” no good “software, tools, or methodologies to manipulate [semi-structured data]”
  – “[Researchers] don’t even agree on the shape of the problem, much less good approaches to solving it.”

Nature of the Problem
Information embedded in text
  – Keyword search insufficient to answer queries
  – Natural language processing also insufficient
Lack of agreement on vocabularies and schemas
  – “Reaching schema agreements among different communities is one of the most expensive steps in software design.”
  – “We need to be able to process information without requiring … a priori schema and vocabulary agreements among participants.”

Example: eBay
“Impossible for … developers to define an a priori schema for the information.”
“Information stored in raw text and searched using only keywords, significantly limiting its usability.”
“Some standard entities (e.g., buyer, date, ask, bid …), but the meat of the information, the item descriptions, has a rich and evolving structure that isn’t captured.”
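
The gap between keyword search over raw text and a query over lightly structured descriptions can be sketched as follows. This is only an illustration: the listings, the tag names (`brand`, `year`, `focal`) and both helper functions are hypothetical and not taken from eBay or from the talk. The descriptions are mixed text and markup with no shared schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical listings: free text with a few inline annotations, no a priori schema.
LISTINGS = [
    '<item id="1">Canon body in good shape. Does not include a Nikon lens.</item>',
    '<item id="2">Lens by <brand>Nikon</brand>, <focal unit="mm">50</focal> prime.</item>',
    '<item id="3">Old <brand>Nikon</brand> camera, <year>1978</year>, works fine.</item>',
]


def keyword_search(listings, word):
    """Return ids of listings whose flattened text contains the keyword."""
    hits = []
    for xml_text in listings:
        item = ET.fromstring(xml_text)
        full_text = "".join(item.itertext())  # drop the markup, keep the words
        if word.lower() in full_text.lower():
            hits.append(item.get("id"))
    return hits


def tagged_search(listings, tag, value):
    """Return ids of listings that carry <tag>value</tag> somewhere inside."""
    hits = []
    for xml_text in listings:
        item = ET.fromstring(xml_text)
        if any((el.text or "").strip() == value for el in item.iter(tag)):
            hits.append(item.get("id"))
    return hits


print(keyword_search(LISTINGS, "Nikon"))          # also matches the Canon listing
print(tagged_search(LISTINGS, "brand", "Nikon"))  # only listings annotated as Nikon
```

The keyword search returns the Canon listing because the word "Nikon" appears in its text. The tagged search uses whatever structure happens to be present and does not require every listing to follow the same schema.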

Why Schemas?
“Schemas assign meaning to the data and … allow automatic data search, comparison, and processing.”
Hierarchy of meaning
  – Raw text: strings (values)
  – Data: attribute-value pairs
  – Information: data in a conceptual framework
  – Knowledge: information with a degree of certainty or community agreement
  – Meaning: knowledge that is relevant or activates
“We have to learn to use and exploit schemas as helpers, but not rely on their existence or allow them to be constraining factors.”
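
A minimal sketch of the first steps of that hierarchy, with the schema used as an optional helper and not as a gatekeeper. The input format (`name: value` pairs separated by semicolons), the function names and the sample schema are all assumptions made for illustration.

```python
def extract_pairs(raw_text):
    """Raw text -> data: split 'name: value; name: value' into attribute-value pairs."""
    pairs = {}
    for chunk in raw_text.split(";"):
        name, sep, value = chunk.partition(":")
        if sep:  # ignore chunks that contain no colon
            pairs[name.strip().lower()] = value.strip()
    return pairs


def apply_schema(pairs, schema=None):
    """Data -> information: convert values the schema knows about, pass the rest through."""
    schema = schema or {}
    typed = {}
    for name, value in pairs.items():
        convert = schema.get(name)
        try:
            typed[name] = convert(value) if convert else value
        except ValueError:
            typed[name] = value  # keep the raw string instead of rejecting the record
    return typed


# Hypothetical schema covering only some of the attributes.
SCHEMA = {"price": float, "year": int}

raw = "price: 120; year: 1978; condition: used; shutter: sticky"
print(apply_schema(extract_pairs(raw)))          # no schema: everything stays a string
print(apply_schema(extract_pairs(raw), SCHEMA))  # schema helps where it applies
```

Attributes the schema has never heard of (`condition`, `shutter`) still flow through, so the schema adds meaning where it exists without constraining what can be stored.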

Schema-Agnostic Tools
Information retrieval (sophisticated search engines?)
  – Find (maybe?) but not answer
  – No DB-like query logic, updates, transactions
XML
  – XML data can exist with or without schemas; schemas can be defined before or after
  – Mixed text/data content
  – Languages for query (XQuery) and transformation (XSLT)
OWL & RDF
  – RDF: subject-predicate-object triples
  – OWL: ontological descriptions usually over RDF triples
  – Classification & inferencing
  – Semantic annotation and tagging
Possible Places to Start
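
The triple model and a simple form of classification by inference can be sketched in plain Python. This is a toy in-memory store, not an RDF library. The `ex:` names are hypothetical, and the single rule applied (an instance of a class is also an instance of its superclasses) is in the spirit of RDFS subclass entailment.

```python
# Each fact is a (subject, predicate, object) triple.
TRIPLES = {
    ("ex:item42", "rdf:type", "ex:Rangefinder"),
    ("ex:Rangefinder", "rdfs:subClassOf", "ex:Camera"),
    ("ex:Camera", "rdfs:subClassOf", "ex:Product"),
    ("ex:item42", "ex:brand", "Leica"),
}


def match(triples, s=None, p=None, o=None):
    """Return the triples that fit the pattern; None acts as a wildcard."""
    return {
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    }


def infer_types(triples):
    """Add the rdf:type triples implied by rdfs:subClassOf until nothing new appears."""
    closed = set(triples)
    while True:
        new = {
            (inst, "rdf:type", parent)
            for inst, _, cls in match(closed, p="rdf:type")
            for _, _, parent in match(closed, s=cls, p="rdfs:subClassOf")
        }
        if new <= closed:  # fixed point reached
            return closed
        closed |= new


closed = infer_types(TRIPLES)
print(match(TRIPLES, p="rdf:type", o="ex:Product"))  # empty: never stated directly
print(match(closed, p="rdf:type", o="ex:Product"))   # item42 is classified as a Product
```

No schema has to exist before the first triple is stored. The subclass statements can be added later, and the classification follows from them.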

Are We Stuck?
Better information-authoring tools (annotation assistance)
Information extraction (automatic annotation)
Creation and reuse of standard schemas and vocabularies (ontology generation)
Mapping schemas to each other (schema mapping)
Automatic data linking (data linking & merging)
Automatic processing of semi-structured data (free-form queries)
What’s Next? – Florescu (Embley)
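
Two of the directions listed above, schema mapping and data linking and merging, can be illustrated with a small sketch. The records, attribute names and hand-written mappings are hypothetical. Real systems would derive mappings and link keys with far more care than the exact match on a normalized name used here.

```python
# Hypothetical records from two sources that use different vocabularies.
SOURCE_A = [{"title": "Leica M3", "cost": "950", "seller": "anna"}]
SOURCE_B = [{"name": "leica  m3", "price_usd": "975", "yr": "1956"}]

# Hand-written mappings into a shared vocabulary (the schema-mapping step).
MAPPING_A = {"title": "name", "cost": "price"}
MAPPING_B = {"name": "name", "price_usd": "price", "yr": "year"}


def map_record(record, mapping):
    """Rename mapped attributes; attributes without a mapping are kept as they are."""
    return {mapping.get(attr, attr): value for attr, value in record.items()}


def link_key(record):
    """Normalize the name so that trivially different spellings link together."""
    return " ".join(record["name"].lower().split())


def link_and_merge(*sources):
    """Group records that share a link key and collect every value seen per attribute."""
    merged = {}
    for records in sources:
        for rec in records:
            entry = merged.setdefault(link_key(rec), {})
            for attr, value in rec.items():
                entry.setdefault(attr, []).append(value)
    return merged


mapped_a = [map_record(r, MAPPING_A) for r in SOURCE_A]
mapped_b = [map_record(r, MAPPING_B) for r in SOURCE_B]
merged = link_and_merge(mapped_a, mapped_b)
print(merged["leica m3"])
```

Conflicting values (the two prices) are kept side by side, not resolved, which leaves the decision to whoever queries the merged data.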

Dataspace System
Supports data and applications in a wide variety of formats all within a dataspace.
Offers an integrated means of searching, querying, updating, and administering the dataspace.
Has varying levels of service (e.g. “best-effort” or approximate answers)
Includes tools to create tighter integration of the data, as necessary.
What’s beyond a database system? – Franklin, Halevy, Maier
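
A toy sketch of the "varying levels of service" idea: one search entry point over sources in different formats, returning exact answers where structure allows and approximate ones otherwise. The `Dataspace` class, its methods and the sample sources are hypothetical illustrations, not the design proposed by Franklin, Halevy and Maier.

```python
class Dataspace:
    """Toy catalog of heterogeneous sources behind a single search call."""

    def __init__(self):
        self.sources = {}  # source name -> list of items (dicts or plain strings)

    def add_source(self, name, items):
        self.sources[name] = list(items)

    def search(self, term, attribute=None):
        """Return (source, item, quality) tuples, exact answers first.

        'exact' means a structured item holds the term as the value of the
        requested attribute; 'approximate' means the term merely occurs
        somewhere in the item's text.
        """
        wanted = term.lower()
        results = []
        for name, items in self.sources.items():
            for item in items:
                if isinstance(item, dict):
                    if attribute and str(item.get(attribute, "")).lower() == wanted:
                        results.append((name, item, "exact"))
                        continue
                    text = " ".join(str(v) for v in item.values())
                else:
                    text = str(item)
                if wanted in text.lower():
                    results.append((name, item, "approximate"))
        results.sort(key=lambda r: r[2] != "exact")  # stable sort keeps source order
        return results


space = Dataspace()
space.add_source("inventory_db", [{"brand": "Leica", "model": "M3"},
                                  {"brand": "Canon", "model": "P"}])
space.add_source("email_notes", ["Asked the seller whether the Leica shutter was serviced."])

hits = space.search("leica", attribute="brand")
for source, item, quality in hits:
    print(quality, source, item)
```

The unstructured notes still contribute a best-effort answer even though they have no `brand` attribute. Tighter integration, such as extracting a brand from the text, could be added later where it is worth the cost.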

“We are still at day one.”
“We need to find a compromise to the tension between the advantages of having schemas, in terms of better understanding and automatically processing the data, and disadvantages imposed by schemas, in terms of inflexibility and lack of evolution.” – Florescu