COMP5338 – Advanced Data Models

Slides:



Advertisements
Similar presentations
Neo4j. План Cypher – Создание – Запросы Neo4j embedded in Java Немного о релизации (Neo4j Internals) – Native Graph Processing – Native Graph Storage.
Advertisements

XML: Extensible Markup Language
Lukas Blunschi Claudio Jossen Donald Kossmann Magdalini Mori Kurt Stockinger.
1 gStore: Answering SPARQL Queries Via Subgraph Matching Presented by Guan Wang Kent State University October 24, 2011.
RDF Tutorial.
© Copyright IBM Corporation 2014 Getting started with Rational Engineering Lifecycle Manager queries Andy Lapping – Technical sales and solutions Joanne.
Jennifer Widom NoSQL Systems Overview (as of November 2011 )
RDF: Building Block for the Semantic Web Jim Ellenberger UCCS CS5260 Spring 2011.
The RDF meta model: a closer look Basic ideas of the RDF Resource instance descriptions in the RDF format Application-specific RDF schemas Limitations.
Module 2b: Modeling Information Objects and Relationships IMT530: Organization of Information Resources Winter, 2007 Michael Crandall.
Graph Data Management Lab, School of Computer Scalable SPARQL Querying of Large RDF Graphs Xu Bo
Graph databases …the other end of the NoSQL spectrum. Material taken from NoSQL Distilled and Seven Databases in Seven Weeks.
Neo4j Adam Foust.
CS 405G: Introduction to Database Systems 24 NoSQL Reuse some slides of Jennifer Widom Chen Qian University of Kentucky.
Logics for Data and Knowledge Representation SPARQL Protocol and RDF Query Language (SPARQL) Feroz Farazi.
Titan Graph Database Meet Bhatt(13MCEC02).
AN INTRODUCTION TO NOSQL DATABASES Karol Rástočný, Eduard Kuric.
RDF (Resource Description Framework) Why?. XML XML is a metalanguage that allows users to define markup XML separates content and structure from formatting.
SPARQL All slides are adapted from the W3C Recommendation SPARQL Query Language for RDF Web link:
Systems analysis and design, 6th edition Dennis, wixom, and roth
Practical RDF Chapter 1. RDF: An Introduction
Logics for Data and Knowledge Representation
Chapter 3 Querying RDF stores with SPARQL. Why an RDF Query Language? Why not use an XML query language? XML at a lower level of abstraction than RDF.
Mastering Neo4j A Graph Database Data Masters. Special Thanks To… Planet Linux Caffe
G-SPARQL: A Hybrid Engine for Querying Large Attributed Graphs Sherif SakrSameh ElniketyYuxiong He NICTA & UNSW Sydney, Australia Microsoft Research Redmond,
KIT – University of the State of Baden-Württemberg and National Large-scale Research Center of the Helmholtz Association Institute of Applied Informatics.
SPARQL W3C Simple Protocol And RDF Query Language
COMP5338 – Advanced Data Models
Efficient RDF Storage and Retrieval in Jena2 Written by: Kevin Wilkinson, Craig Sayers, Harumi Kuno, Dave Reynolds Presented by: Umer Fareed 파리드.
EEL 5937 Ontologies EEL 5937 Multi Agent Systems Lecture 5, Jan 23 th, 2003 Lotzi Bölöni.
Practical RDF Chapter 10. Querying RDF: RDF as Data Shelley Powers, O’Reilly SNU IDB Lab. Hyewon Lim.
Chuck Olson Software Engineer October 2015 Graph Databases and Java 1.
Dr. Lowell Vizenor Ontology and Semantic Technology Practice Lead Alion Science and Technology Semantic Technology: A Basic Introduction.
The Structure of the Web. Getting to knowing the Web How big is the web and how do you measure it? How many people use the web? How many use search engines?
The RDF meta model Basic ideas of the RDF Resource instance descriptions in the RDF format Application-specific RDF schemas Limitations of XML compared.
Scalable Hybrid Keyword Search on Distributed Database Jungkee Kim Florida State University Community Grids Laboratory, Indiana University Workshop on.
Dr. Bhavani Thuraisingham September 24, 2008 Building Trustworthy Semantic Webs Lecture #9: RDF and RDF Security.
THE SEMANTIC WEB By Conrad Williams. Contents  What is the Semantic Web?  Technologies  XML  RDF  OWL  Implementations  Social Networking  Scholarly.
Session 1 Module 1: Introduction to Data Integrity
Research Directions for Big Data Graph Analytics John A. Miller, Lakshmish Ramaswamy, Krys J. Kochut and Arash Fard.
NoSQL Systems Motivation. NoSQL: The Name  “SQL” = Traditional relational DBMS  Recognition over past decade or so: Not every data management/analysis.
EEL 5937 Ontologies EEL 5937 Multi Agent Systems Lotzi Bölöni.
Steven Perry Dave Vieglais. W a s a b i Web Applications for the Semantic Architecture of Biodiversity Informatics Overview WASABI is a framework for.
NoSQL: Graph Databases. Databases Why NoSQL Databases?
Graph and RDF Databases Context : Course of Advanced Databases Prepared by : Nassim BAHRI Nabila HOSNI February 19th, 2015.
Chapter 04 Semantic Web Application Architecture 23 November 2015 A Team 오혜성, 조형헌, 권윤, 신동준, 이인용.
Graph Database - Neo4j ISQS3358, Spring Graph Database A graph database is a database that uses graph structures for semantic queries with nodes,
FACIAL RE-RECOGNITION Jared Beekman Mentor: Dr. Paul Bender.
Copyright © 2015 Oracle and/or its affiliates. All rights reserved. How Can RDF and OWL Coexist with Property Graph Zhe Wu Architect Oracle Spatial and.
1 Analysis on the performance of graph query languages: Comparative study of Cypher, Gremlin and native access in Neo4j Athiq Ahamed, ITIS, TU-Braunschweig.
Setting the stage: linked data concepts Moving-Away-From-MARC-a-thon.
Linked Data & Semantic Web Technology The Semantic Web Part 4. Resource Description Framework (1) Dr. Myungjin Lee.
NoSQL: Graph Databases
Neo4j: GRAPH DATABASE 27 March, 2017
CS 405G: Introduction to Database Systems
NoSQL: Graph Databases
and Big Data Storage Systems
Introduction to Persistent Identifiers
SPARQL.
Every Good Graph Starts With
Graph Database.
Probabilistic Data Management
David Ostrovsky | Couchbase
NOSQL databases and Big Data Storage Systems
Selecting the right database for your semantic storage needs
NoSQL Systems Overview (as of November 2011).
Lu Xing CS59000GDM Sept 7th, 2018.
Graphs.
A gentle introduction to graph databases
Information Networks: State of the Art
Presentation transcript:

COMP5338 – Advanced Data Models Week 6: Graph Data Model and Introduction to Neo4j

Outline Brief Review of Graphs Examples of Graph Data Graph Engines and Graph Databases Introduction to Neo4j COMP5338 "Advanced Data Models" - 2015 (A. Fekete & Y. Zhou)

Graphs A graph is just a collection of vertices and edges Vertex is also called Node Edge is also called Arc/Link COMP5338 "Advanced Data Models" - 2015 (A. Fekete & Y. Zhou)

Type of Graphs Undirected graphs Directed graphs Edges have no orientation (direction) (a, b) is the same as (b, a) Directed graphs Edges have orientation (direction) (a, b) is not the same as (b, a) COMP5338 "Advanced Data Models" - 2015 (A. Fekete & Y. Zhou)

Representing Graph Data Data structures used to store graphs in programs Adjacency list Adjacency matrix COMP5338 "Advanced Data Models" - 2015 (A. Fekete & Y. Zhou)

Adjacency List Directed Graph Undirected Graph COMP5338 "Advanced Data Models" - 2015 (A. Fekete & Y. Zhou)

Adjacency matrix – Directed Graph Undirected Graph COMP5338 "Advanced Data Models" - 2015 (A. Fekete & Y. Zhou)

Outline Brief Review of Graphs Examples of Graph Data Graph Engines and Graph Databases Introduction to Neo4j COMP5338 "Advanced Data Models" - 2015 (A. Fekete & Y. Zhou)

Examples of graphs Social graphs Computer Network topologies Organization structure Facebook, LinkedIn, etc. Computer Network topologies Data centre layout Network routing tables Road, Rail and Airline networks Biological data COMP5338 "Advanced Data Models" - 2015 (A. Fekete & Y. Zhou)

Social Graphs and extension A small social graph Related information can be captured in the graph Page 2 of the graph database book Page 3 of the graph database book COMP5338 "Advanced Data Models" - 2015 (A. Fekete & Y. Zhou)

Social Graph with Various Relationships Page 19 of the graph database book COMP5338 "Advanced Data Models" - 2015 (A. Fekete & Y. Zhou)

Transaction information Page 23 of the graph database book COMP5338 "Advanced Data Models" - 2015 (A. Fekete & Y. Zhou)

Outline Brief Review of Graphs Examples of Graph Data Graph Engines and Graph Databases Introduction to Neo4j COMP5338 "Advanced Data Models" - 2015 (A. Fekete & Y. Zhou)

Typical way to store connected data Who are Bob’s friends? SELECT p1.Person FROM Person p1 JOIN PersonFriend pf ON pf.FriendID = p1.ID JOIN Person p2 ON pf.PersonID = p2.ID WHERE p2.Person = ”Bob” Page 13 of the graph database book COMP5338 "Advanced Data Models" - 2015 (A. Fekete & Y. Zhou)

RDBMS to store Graphs Who are friends with Bob? SELECT p1.Person FROM Person p1 JOIN PersonFriend pf ON pf.PersonID = p1.ID JOIN Person p2 ON pf.FriendID = p2.ID WHERE p2.Person = ”Bob” COMP5338 "Advanced Data Models" - 2015 (A. Fekete & Y. Zhou)

RDBMS to store Graphs Who are Alice’s friends-of-friends? SELECT p1.Person AS PERSON, p2.Person AS FRIEND_OF_FRIEND FROM PersonFriend pf1 JOIN Person p1 ON pf1.PersonID = p1.ID JOIN PersonFriend pf2 ON pf2.PersonID = pf1.FriendID JOIN Person p2 ON pf2.FriendID = p2.ID WHERE p1.Person = ”Alice” AND pf2.FriendID <> p1.ID COMP5338 "Advanced Data Models" - 2015 (A. Fekete & Y. Zhou)

Graph Technologies Graph Processing Graph Databases take data in any input format and perform graph related operations OLAP – OnLine Analysis Processing of graph data Google Pregel, Apache Giraph Graph Databases manage, query, process graph data support high-level query language native storage of graph data OLTP – OnLine Transaction Processing possible OLAP – also possible COMP5338 "Advanced Data Models" - 2015 (A. Fekete & Y. Zhou)

Data Models of Graph Databases RDF (Resource Description Framework) Model Express node-edge relation as “subject, predicate, object” triple (RDF statement) SPARQL query language Examples: AllegroGraph, Apache Jena Property Graph Model Express node and edge as object like entities, both can have properties Various query language Examples Apache Titan Support various NoSQL storage engine: BerkeleyDB, Cassandra, HBase Structural query language: Gremlin Neo4j Native storage manager for graph data (Index-free Adjacency) Declarative query language: Cypher query language COMP5338 "Advanced Data Models" - 2015 (A. Fekete & Y. Zhou)

Resource Description Framework Proposed in 1999 by W3C to describe resources on the web as part of semantic web activity Designed to be read and understood by computers A major use case is to write ontologies and to support inference RDF identifies resources using URI and describe resources with properties and property values Descriptions are a collection of triples Subject – what does it describe (the resource) Predicate – what property of the subject Object – what is the value of the property Example – description of a web page Subject – index.html Predicate – creation date Object – 3 September, 2013 index.html has a creation-date of 3 September, 2013 COMP5338 "Advanced Data Models" - 2015 (A. Fekete & Y. Zhou)

RDF/XML An XML syntax for RDF Describes the following: <xml version=”1.0”?> <rdf:RDF xmlns:rdf=”http://www.w3.org/1999/02/22-rdf-syntax-ns#” xmlns:exterms=”http://www.example.org/terms”> <rdf:Description rdf:about=”http://www.example.org/index.html”> <exterms:creation-date>September 3, 2013</exterms:creation-date> </rdf:Description> </rdf:RDF> Describes the following: http://www.example.org/index.html has a creation-date of September 3, 2013 COMP5338 "Advanced Data Models" - 2015 (A. Fekete & Y. Zhou)

RDF/Triples <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:foaf="http://xmlns.com/foaf/0.1/" xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"> <foaf:Person rdf:nodeID=“Pi"> <foaf:name>Pi</foaf:name> <foaf:age>14</foaf:age> <foaf:knows> <foaf:Person rdf:nodeID=“R.Parker”> <foaf:name>Richard Parker</foaf:name> </foaf:Person> </foaf:knows> </rdf:RDF> RDF based graph storage system storage and retrieval Triples using SQL-like declarative query language SPARQL Four Triples Pi foaf:name “Pi” Pi foaf:age “14” Pi foaf:knows R.Parker R.Parker foaf:name “Richard Parker” COMP5338 "Advanced Data Models" - 2015 (A. Fekete & Y. Zhou)

Property Graph Model Proposed by Neo technology No standard definition or specification Both Nodes and Edges can have property RDF model cannot express edge property in a natural and easy to understand way The actual storage varies The query language varies since: 1 year common: sailing COMP5338 "Advanced Data Models" - 2015 (A. Fekete & Y. Zhou)

Outline Brief Review of Graphs Examples of Graph Data Graph Engines and Graph Databases Introduction to Neo4j COMP5338 "Advanced Data Models" - 2015 (A. Fekete & Y. Zhou)

Neo4j Native graph storage using property graph model Index-free Adjacency Nodes and Relationships are stored Supports property indexes Replication Single master-multiple slaves replication Neo4j cluster is limited to master/slave replication configuration Database engine is not distributed Cypher – query language Also supports SPARQL We will discuss the internals and indexing next week COMP5338 "Advanced Data Models" - 2015 (A. Fekete & Y. Zhou)

Property Graph Model Property graph has the following characteristics It contains nodes and relationships Nodes contain properties Properties are stored in the form of key-value pairs A node can have labels Relationships connect nodes Has a direction, a type/name, start node, end node No dangling relationships (can’t delete node with a relationship) Properties Both nodes and relationships have properties Useful in modeling and querying based on properties of relationships http://docs.neo4j.org/chunked/milestone/graphdb-neo4j.html COMP5338 "Advanced Data Models" - 2015 (A. Fekete & Y. Zhou)

Property Graph ModelExample COMP5338 "Advanced Data Models" - 2015 (A. Fekete & Y. Zhou)

Property Graph Model: Nodes http://docs.neo4j.org/chunked/milestone/graphdb-neo4j-nodes.html COMP5338 "Advanced Data Models" - 2015 (A. Fekete & Y. Zhou)

Property Graph Model: Relationships http://docs.neo4j.org/chunked/milestone/graphdb-neo4j-relationships.html COMP5338 "Advanced Data Models" - 2015 (A. Fekete & Y. Zhou)

Property Graph Model: Properties http://docs.neo4j.org/chunked/milestone/graphdb-neo4j-properties.html COMP5338 "Advanced Data Models" - 2015 (A. Fekete & Y. Zhou)

Property Graph Model: Labels A node may be labeled with any number of labels, including none, making labels an optional addition to the graph http://docs.neo4j.org/chunked/milestone/graphdb-neo4j-labels.html#_label_names COMP5338 "Advanced Data Models" - 2015 (A. Fekete & Y. Zhou)

Property Graph Model: Paths A path is one or more nodes with connecting relationships, typically retrieved as a query or traversal result. A path of length zero A path of length one COMP5338 "Advanced Data Models" - 2015 (A. Fekete & Y. Zhou)

Cypher Cypher is a query language specific to Neo4j Easy to read and understand It uses patterns to represents core concepts in the property graph model It uses clauses to build queries; Certain clauses and keywords are inspired by SQL COMP5338 "Advanced Data Models" - 2015 (A. Fekete & Y. Zhou)

Cypher patterns: node A single node Related nodes Labels  A node is described using a pair of parentheses, and is typically given an identifier E.g.: (n) means a node n Related nodes Related nodes are expressed using an arrow between two nodes E.g. (a)-->(b); (a)-->(b)<--(c); or (a)-->()<--(c) Labels Label(s) can be attached to a node E.g.: (a:User)-->(b) Specifying properties Properties are a list of name value pairs enclosed in a curly brackets E.g.: (a { name: "Andres", sport: "Brazilian Ju-Jitsu" }) http://neo4j.com/docs/stable/introduction-pattern.html COMP5338 "Advanced Data Models" - 2015 (A. Fekete & Y. Zhou)

Cypher patterns: relationships Basic Relationships Directions are not important: (a)--(b) Named relationship: (a)-[r]->(b) Named and typed relationship: (a)-[r:REL_TYPE]->(b) Specifying Relationship that may belong to one of a set of types: (a)-[r:TYPE1|TYPE2]->(b) Typed but not named relationship: (a)-[:REL_TYPE]->(b) Relationship of variable lengths (a)-[*2]->(b) describes a path of length 2 between node a and node b (a)-[*3..5]->(b) describes a path of minimum length of 3 and maximum length of 5 between node a and node b Either bound can be omitted (a)-[*3..]->(b), (a)-[*..5]->(b) Both bounds can be omitted as well (a)-[*]->(b) COMP5338 "Advanced Data Models" - 2015 (A. Fekete & Y. Zhou)

Cypher Clauses - Create Create nodes or relationships with properties Create a node matrix1 with the label Movie CREATE (matrix1:Movie {title:'The Matrix', released:1999, tagline:'Welcome to the Real World'}) Create a node keanu with the label Actor CREATE (keanu:Actor {name:'Keanu Reeves', born:1964}) Create a relationship ACTS_IN CREATE (keanu)-[:ACTS_IN {roles:'Neo'}]->(matrix1) COMP5338 "Advanced Data Models" - 2015 (A. Fekete & Y. Zhou)

Cypher – Read MATCH … RETURN Return all nodes MATCH (n) RETURN n Return all nodes with a given label MATCH (movie:Movie) RETURN movie Return all actors in the movie “The Matrix” MATCH (a:Actor)-[:ACTS_IN]->(m:Movie{title:"The Matrix"}) RETURN a.name http://docs.neo4j.org/chunked/milestone/query-match.html COMP5338 "Advanced Data Models" - 2015 (A. Fekete & Y. Zhou)

Cypher - Update MATCH … SET/REMOVE … RETURN Set all the age property for all actor nodes MATCH (n:Actor) SET n.age = 2014 - n.born RETURN n Remove a property REMOVE n.age Remove a label MATCH (n:Actor{name:"Keanu Reeves"}) REMOVE n:Actor COMP5338 "Advanced Data Models" - 2015 (A. Fekete & Y. Zhou)

Cypher - Delete MATCH … DELETE Delete relationship MATCH (n{name:"Keanu Reeves"})-[r:ACTS_IN]->() DELETE r Delete a node and all possible relationship MATCH (m{title:'The Matrix'})-[r]-() DELETE m,r COMP5338 "Advanced Data Models" - 2015 (A. Fekete & Y. Zhou)

Upcoming Assessment Quiz tonight in the lab! Group assignment instruction can be viewed in Elearning and edlms The data set can be downloaded from ELearning Work in group of up to 3 students Both design and implementation parts Submit a report and demo your system On week 10 You can form group now An Elearning link will be available for group sign up soon COMP5338 "Advanced Data Models" - 2015 (A. Fekete & Y. Zhou)

References Ian Robinson, Jim Webber and Emil Eifrem, Graph Databases, Second Edition, O’Reilly Media Inc., June 2015 You can download this book from the Neo4j site, http://www.neo4j.org/learn will redirect you to http://graphdatabases.com/ The Neo4j Manual v2.2.5 The Neo4j Graph Database (http://neo4j.com/docs/stable/graphdb-neo4j.html) Cypher Query Language (http://neo4j.com/docs/stable/cypher-query-lang.html) Noel Yuhanna, Market Overview: Graph Databases, Forrester White Paper, May, 2015 Renzo Angeles, A Comparison of Current Graph Data Models, ICDE Workshops 2013 (DOI-10.1109/ICDEW.2012.31) Renzo Angeles and Claudio Gutierrez, Survey of Graph Database Models, ACM Computing Surveys, Vol. 40, N0. 1, Article 1, February 2008 (DOI-10.1145/1322432.1322433) COMP5338 "Advanced Data Models" - 2015 (A. Fekete & Y. Zhou)